Subject: Re: xen 3.1 problem (Re: xen 3.1.0 is there)
To: Kazushi Marukawa <jam@pobox.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 06/19/2007 20:53:18
On Wed, Jun 20, 2007 at 12:17:45AM +0900, Kazushi Marukawa wrote:
>    On Jun 19, 14:38, Manuel Bouyer wrote:
>    > Subject: Re: xen 3.1 problem (Re: xen 3.1.0 is there)
>    >
>    > This looks a lot like the bug I fixed in dom0 support recently. You need a
>    > very recent current or 4.0 kernel for xen3.1  dom0 (newer than
>    > 2007/06/13 10:38:44, which is the time when the bug fix was pulled up to
>    > netbsd-4).
> 
> Thanks.  I got confused about which kernel was the current one while working
> between two machines.  My bad.  With a suitably recent kernel (2007/06/15),
> the NetBSD DomU works fine with xen3.1.
> 
> 
> However, I'm having a read_psl problem now.  hehe.  It is very similar to
> the problem I reported on May 26th.  At that time, I just rolled my
> NetBSD back to Apr 30th.  This time, I need the recent kernel for Xen 3.1,
> so I'm working to find the source of the problem.
> 
> Last time, I had the problem when I tried to run WinXP.  However,
> this time I'm getting it both from Dom0 while trying to run WinXP and from
> the NetBSD DomU while operating the DomU.  Here is a trace from ddb.
> 
> 
> panic(c0413ef4,c03dc1ab,c03ed550,c040f720,88d) at netbsd:panic+0x155
> __assert(c03dc1ab,c040f720,88d,c03ed550,20) at netbsd:__assert+0x2e
> pmap_load(c02edf06,cba2cb14,bfbfe614,4,14) at netbsd:pmap_load+0x31b
> copyout(cb9bc7e0,cba2cc68,100,bfbfe614,bfbfe5f4) at netbsd:copyout+0xe
> sys_select(cb9bc7e0,cba2cc48,cba2cc68,80c8f80,11) at netbsd:sys_select+0x69
> syscall_plain() at netbsd:syscall_plain+0xb9
> --- syscall (number 93) ---
> 0xbbaf7f4f:
> db> reboot
> syncing disks... panic: kernel diagnostic assertion "read_psl() == 0" failed: file "/mnt/raid/netbsd/current/src/sys/arch/xen/i386/pmap.c", line 2189
> Stopped in pid 305.1 (screen-4.0.3) at  netbsd:cpu_Debugger+0x4:        popl    %ebp
> db> 
> 
> Now I'm trying to find which source change between 4/30 and 5/20 causes
> this problem.  I'll be back if I find something.

Is the first panic() also a "read_psl() == 0" one?
That would mean something disabled interrupts between copyout() and
pmap_load() and failed to re-enable them, but I didn't find anything obvious.
copyout() itself doesn't call pmap_load(), so there's probably a trap in
between that ddb doesn't show.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--