Subject: repne ins[bwl] vs rep ins[bwl]
To: None <port-i386@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-i386
Date: 01/22/2006 16:24:08
--vtzGhvizbBRQ85DL
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Hi,
while working on Xen3 support, I've run into a problem I don't fully
understand. Basically, insw() doesn't read data while a C loop around
inw() works as expected.
Here is the interesting part of a ddb section.
port to read is 0x1f0, len is 512, so 0x100 words to read. Target is
0xc6318b64.
db> sh reg
ds 0x11
es 0x11
fs 0x31
gs 0x11
edi 0xc6318b64
esi 0xc6318b64
ebp 0xc6318a2c
ebx 0xc0b0ab80
edx 0x1f0
ecx 0x100
eax 0
eip 0xc01b85d6 wdc_datain_pio+0x106
cs 0x9
eflags 0x206
esp 0xc63189f4
ss 0x11
netbsd:wdc_datain_pio+0x106: cld
db> s
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0x107: repne insw
%
dx,%es:(%edi)
db>
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0xc0: xorl %edi,%ed
i
db> sh r
ds 0x11
es 0x11
fs 0x31
gs 0x11
edi 0xc6318b66
esi 0xc6318b64
ebp 0xc6318a2c
ebx 0xc0b0ab80
edx 0x1f0
ecx 0x100
eax 0
eip 0xc01b8590 wdc_datain_pio+0xc0
cs 0x9
eflags 0x206
esp 0xc63189f4
ss 0x11
netbsd:wdc_datain_pio+0xc0: xorl %edi,%edi
db> s
As you can see, the target (%edi) was incremented by only one word, and
the counter (%ecx) was not touched. Looking at the target buffer, the
first 2 bytes contains the expected values, while the remaining are left
untouched (all 0, while the 3rd byte should be 0xff). The device (an IDE
drive) is still in a state where is expect the host to read data (DRQ asserted
in the status register).
Now I see we're doing a repne insw here. From the intel docs, only rep
is allowed with {in,out}s{b,w,l} instructions. So I remplaced repne with
rep in sys/arch/x86/include/pio.h (see attached diff), and this made things
work properly:
db> sh re
ds 0x11
es 0x11
fs 0x31
gs 0x11
edi 0xc6318b64
esi 0xc6318b64
ebp 0xc6318a2c
ebx 0xc0b0ab80
edx 0x1f0
ecx 0x100
eax 0
eip 0xc01b85d7 wdc_datain_pio+0x107
cs 0x9
eflags 0x206
esp 0xc63189f4
ss 0x11
netbsd:wdc_datain_pio+0x107: repe insw %dx,%es:(%edi)
db> s
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0xc0: xorl %edi,%ed
i
db> sh re
ds 0x11
es 0x11
fs 0x31
gs 0x11
edi 0xc6318d64
esi 0xc6318b64
ebp 0xc6318a2c
ebx 0xc0b0ab80
edx 0x1f0
ecx 0
eax 0
eip 0xc01b8590 wdc_datain_pio+0xc0
cs 0x9
eflags 0x206
esp 0xc63189f4
ss 0x11
Now %edi is incremented by the expected value, %ecx is 0 and the data are
read properly. The IDE driver works. This change also made the ahc driver
work, while it would panic in setup phase with the original pio.h.
I tested a kernel on Xen2, and a native i386 kernel with this patch, and
they work properly.
Linux also use rep, not repne.
Now I don't know why repne doesn't work with Xen3 while it works otherwise.
Maybe it's interrupt-related, and Xen3 doesn't restart a repne ins properly
(although, from the Intel docs, and interrupted rep should be restarted
automatically by the processor).
In case this matters, all this is on a:
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel Pentium II (686-class), 447.51 MHz, id 0x652
cpu0: features 183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 183fbff<PGE,MCA,CMOV,PAT,PSE36,MMX>
cpu0: features 183fbff<FXSR>
cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu0: L2 cache 512 KB 32B/line 4-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: 32 page colors
hypervisor0 at mainbus0
Does anyone know if my patch can cause problems on some CPUs ? I don't think
so, as linux doens't seem to have special-case for this.
Does anyone object if I commit this patch ?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--vtzGhvizbBRQ85DL
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: include/pio.h
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/include/pio.h,v
retrieving revision 1.3
diff -u -r1.3 pio.h
--- include/pio.h 24 Dec 2005 20:07:42 -0000 1.3
+++ include/pio.h 22 Jan 2006 14:49:07 -0000
@@ -84,7 +84,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\tinsb" :
+ __asm volatile("cld\n\trep\n\tinsb" :
"=D" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt) :
"memory");
@@ -114,7 +114,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\tinsw" :
+ __asm volatile("cld\n\trep\n\tinsw" :
"=D" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt) :
"memory");
@@ -144,7 +144,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\tinsl" :
+ __asm volatile("cld\n\trep\n\tinsl" :
"=D" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt) :
"memory");
@@ -171,7 +171,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\toutsb" :
+ __asm volatile("cld\n\trep\n\toutsb" :
"=S" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt));
}
@@ -197,7 +197,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\toutsw" :
+ __asm volatile("cld\n\trep\n\toutsw" :
"=S" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt));
}
@@ -223,7 +223,7 @@
{
void *dummy1;
int dummy2;
- __asm volatile("cld\n\trepne\n\toutsl" :
+ __asm volatile("cld\n\trep\n\toutsl" :
"=S" (dummy1), "=c" (dummy2) :
"d" (port), "0" (addr), "1" (cnt));
}
--vtzGhvizbBRQ85DL--