Subject: Re: HEAD instability on Xen
To: Andrew Doran <ad@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 11/19/2007 12:00:00
On Sun, Nov 18, 2007 at 11:12:38PM +0100, Manuel Bouyer wrote:
> On Sun, Nov 18, 2007 at 10:08:03PM +0000, Andrew Doran wrote:
> >
> > Actually, it already has all the necessary changes. Is there a point where
> > the kernel is entered with Xen that the direction flag might not be cleared?
>
> I'll have a look. Could it be a missing lock or splxx() in the pmap?
>
> > I can't see one. It is cleared for the copyout() in the traceback that you
> > posted.
> >
> > > I've also seen it in the pool code. It was always handling a trap after a
> > > copyin or copyout though.
> >
> > I did many hours of low-memory stress testing on the updated pool code
> > before checking it in, so I don't believe that there is an (obvious) problem
> > there. It could perhaps be related to the removal of the _CPU options.
>
> No, I've seen it with a kernel from before the _CPU options removal
> (the bouyer-xenamd64-base2 tag in src/sys).

I've tracked it down to a change between 2007.11.05.10.25.03 and
2007.11.08.10.25.03 on HEAD. Here's another instance of the panic (with
2007.11.08.10.25.03):

Starting xend.
uvm_fault(0xc095b880, 0xc8001000, 2) -> 0xe
fatal page fault in supervisor mode
trap type 6 code 2 eip c04da68d cs 9 eflags 10246 cr2 0 ilevel 0
kernel: supervisor trap page fault, code=0
Stopped in pid 169.1 (python2.4) at netbsd:mutex_enter+0xd: cmpxchgl %ecx,0(%edx)
db> tr
mutex_enter(c5d90ac4,8063000,2,0,c04d94a9) at netbsd:mutex_enter+0xd
trap() at netbsd:trap+0x415
--- trap (number 6) ---
i486_copyout(c5d90ac4,c752a000,8063000,4c8,c5d90ac4) at netbsd:i486_copyout+0x40
uiomove(c752a000,4c8,c7d9f98c,c7d9f86c,0) at netbsd:uiomove+0x5d
ubc_uiomove(c7d64c80,c7d9f98c,4c8,0,101) at netbsd:ubc_uiomove+0xeb
ffs_read(c7d9f948,10001,0,4,0) at netbsd:ffs_read+0x46b
VOP_READ(c7d64c80,c7d9f98c,10,c5d83f00,ffffffff) at netbsd:VOP_READ+0x31
vn_rdwr(0,c7d64c80,8063000,4c8,1b000) at netbsd:vn_rdwr+0xa5
vmcmd_readvn(c7f6be00,c0d17e1c,bfc00000,c7d9fbe8,8) at netbsd:vmcmd_readvn+0x65
execve1(c7f6be00,bba83a30,bfbfe704,bfbfeeec,c040cdb0) at netbsd:execve1+0x67d
sys_execve(c7f6be00,c7d9fc48,c7d9fc68,c7d9fce0,bba990c0) at netbsd:sys_execve+0x31
syscall_plain() at netbsd:syscall_plain+0x146
--- syscall (number 59) ---
0xbb9df907:
db> show registers
ds 0x11
es 0x11
fs 0x31
gs 0x11
edi 0
esi 0
ebp 0xc7d9f6c8
ebx 0xc7d9f64c
edx 0xc8001f40
ecx 0xc7f6be00
eax 0
eip 0xc04da68d mutex_enter+0xd
cs 0x9
eflags 0x10246
esp 0xc7d9f59c
ss 0x11
netbsd:mutex_enter+0xd: cmpxchgl %ecx,0(%edx)
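
As a quick sanity check on the dump above, here is a minimal sketch in Python that decodes the saved eflags value and compares the address used by the faulting cmpxchgl with the address reported by uvm_fault(). It is not part of the kernel or ddb; the helper names are made up for illustration, and it assumes that the second value printed by uvm_fault() is the page-aligned faulting virtual address. The flag masks follow the standard x86 EFLAGS layout. Note that these eflags are the ones saved for the fault inside mutex_enter, so they say nothing certain about the flag state during the earlier copyout trap.

```python
PAGE_SIZE = 4096  # i386 page size in bytes

# Standard x86 EFLAGS bit masks (subset), listed from the low bit upward.
EFLAGS_BITS = {
    "CF": 0x00000001,
    "PF": 0x00000004,
    "AF": 0x00000010,
    "ZF": 0x00000040,
    "SF": 0x00000080,
    "TF": 0x00000100,
    "IF": 0x00000200,
    "DF": 0x00000400,  # direction flag
    "OF": 0x00000800,
    "RF": 0x00010000,
    "VM": 0x00020000,
}


def decode_eflags(value):
    """Return the names of the flags set in an EFLAGS value."""
    return [name for name, mask in EFLAGS_BITS.items() if value & mask]


def page_base(addr):
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


# Values copied from the ddb output above.
eflags = 0x10246
edx = 0xc8001f40        # operand address of cmpxchgl %ecx,0(%edx)
fault_va = 0xc8001000   # second value printed by uvm_fault()

flags = decode_eflags(eflags)
print("flags set:", flags)
print("direction flag set:", "DF" in flags)
print("page of edx:", hex(page_base(edx)),
      "matches uvm_fault address:", page_base(edx) == fault_va)
```

With the values from this trace, the direction flag is clear at the point of the fault in mutex_enter, and the page containing the address in edx is the same page that uvm_fault() failed on.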
I'll try to narrow down the date some more.
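
Narrowing the date range amounts to a bisection between the last timestamp without the panic and the first one with it. The following is only a sketch of how the next candidate timestamp could be computed, assuming 2007.11.05.10.25.03 is the known-good end and 2007.11.08.10.25.03 the known-bad end; the function name is hypothetical, and checking out and building the tree at that date is left out.

```python
from datetime import datetime

# Timestamp format used in this message: YYYY.MM.DD.hh.mm.ss
STAMP_FMT = "%Y.%m.%d.%H.%M.%S"


def midpoint(good, bad):
    """Return the timestamp halfway between a good and a bad checkout date."""
    t_good = datetime.strptime(good, STAMP_FMT)
    t_bad = datetime.strptime(bad, STAMP_FMT)
    return (t_good + (t_bad - t_good) / 2).strftime(STAMP_FMT)


# Next date to test, given the range quoted in this message.
print(midpoint("2007.11.05.10.25.03", "2007.11.08.10.25.03"))
```

If the kernel from the midpoint date panics, the midpoint becomes the new bad end; otherwise it becomes the new good end, and the step is repeated until the range covers only a few commits.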
--
Manuel Bouyer, LIP6, Universite Paris VI. Manuel.Bouyer@lip6.fr
NetBSD: 26 years of experience will always make the difference