Subject: kern/29824: Xserver triggers threading problem
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <macallan18@earthlink.net>
List: netbsd-bugs
Date: 03/29/2005 12:35:00
>Number: 29824
>Category: kern
>Synopsis: Xserver triggers threading problem
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 29 12:35:00 +0000 2005
>Originator: Michael Lorenz
>Release: -current with -current userland
>Organization:
>Environment:
NetBSD Inishowen 3.99.2 NetBSD 3.99.2 (INISHOWEN) #325: Mon Mar 28 08:38:50 EST 2005 root@Inishowen:/data/src/sys/arch/sparc64/compile/INISHOWEN sparc64
>Description:
I'm running XFree86 4.5, compiled from xsrc, on a Sun Ultra 10. After a while the Xserver appears to lock up, usually while there is some I/O or paging going on. ssh and so on still work, so I can still log in and see what's going on: XFree86 just sits there consuming all the CPU cycles it can get. Loading it into gdb gives this:
(gdb) bt
#0 0x0000000040a132a4 in pthread__lock_ras_end ()
from /usr/lib/libpthread.so.0
#1 0x0000000000000008 in ?? ()
#2 0x0000000040a134e8 in pthread_spinlock () from /usr/lib/libpthread.so.0
#3 0x0000000040a0bb38 in pthread_sigmask () from /usr/lib/libpthread.so.0
#4 0x000000000016c598 in xf86BlockSIGIO ()
#5 0x000000000014d650 in xf86SigioReadInput ()
#6 0x000000000016c248 in xf86SIGIO ()
#7 <signal handler called>
#8 0xffffaba100000000 in ?? ()
#9 0x0000000000000008 in ?? ()
#10 0x000000000016c5dc in xf86UnblockSIGIO ()
#11 0x000000000016c248 in xf86SIGIO ()
#12 <signal handler called>
#13 0xffffb1a100000000 in ?? ()
#14 0x0000000001347bb8 in ?? ()
#15 0xffffffffffffb16d in ?? ()
#16 0x000000000085915c in ?? ()
#17 0x00000000008598cc in ?? ()
#18 0x0000000000859c14 in ?? ()
#19 0x0000000000859dc8 in ?? ()
#20 0x00000000007a038c in ?? ()
#21 0x00000000007a20d8 in ?? ()
#22 0x00000000001ffd4c in miSpriteCopyArea ()
#23 0x000000000018a4a0 in ProcCopyArea ()
#24 0x0000000000187f78 in Dispatch ()
#25 0x000000000019b9f8 in main ()
#26 0x000000000012cb38 in ___start ()
... or something very similar. It's always in pthread_spinlock. The process can be killed and restarted (only to lock up again pretty soon), but apparently some kernel internals get screwed up: the machine doesn't reboot anymore, it shuts down the network and that's it.
Side note: something similar happens on macppc, but apparently without affecting the kernel, since my S900 (also running 3.99.2) still reboots after I kill a deadlocked Xserver.
>How-To-Repeat:
Compile XFree86 4.5 from xsrc (I used the onboard Rage 3D Pro of a U10, but I don't think that matters here), run some window manager and do some work. After a while the Xserver will lock up.
>Fix:
Stick to 4.4 for now, at least it doesn't have /this/ problem.