Subject: kern/27023: kernel crashes in LWP code from userland
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <reinoud@netbsd.org>
List: netbsd-bugs
Date: 09/24/2004 13:45:37
>Number: 27023
>Category: kern
>Synopsis: kernel crashes in LWP code from userland
>Confidential: yes
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Sep 24 11:46:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator: Reinoud Zandijk
>Release: NetBSD 2.0G
>Organization:
NetBSD
>Environment:
- NetBSD/sparc 2-0 installation (200409050000)
- NetBSD/sparc kernel 2.0G
- Sun SPARCclassic
- mysql-server/client 3.x
- dspam with mysql driver
System: NetBSD rangerover 2.0G NetBSD 2.0G (GENERIC) #0: Thu Aug 26 12:02:22 CEST 2004 imago@rangerover:/usr/sources/cvs.netbsd.org/src/sys/arch/sparc/compile/GENERIC sparc
Architecture: sparc
Machine: sparc
>Description:
When running the dspam tools, dspam connects to the mysql server through the
mysql socket, and this ran fine. At one point I started a second process, but
since it had connection problems to the server, I temporarily suspended the
dspam process, as it might have been consuming all of the precious little
processor time. The connection still didn't come up. I then resumed the dspam
processes, but they seemed stuck again.
It turned out that a mysqld process was waiting on `sigwait' (according to
`top'). Thinking that it might be `convinced' to run again, I sent it a
`CONT' signal. That didn't help. Then I decided to send it a `TERM' (9)
to get rid of it, and that panicked the machine.
The last `top' output I saw had the entry:
429 mysql -18 4 18M 8824K anonget2 124:13 0.00% 0.00% <mysqld>
The last messages the kernel logged:
Sep 24 04:03:38 rangerover syslogd: restart
Sep 24 04:03:38 rangerover /netbsd: data fault: pc=0xf018deb4 addr=0x24 sfsr=326 <PERR=0,LVL=3,AT=1,FT=1,FAV,OW>
Sep 24 04:03:38 rangerover /netbsd: panic: kernel fault
Sep 24 04:03:39 rangerover /netbsd: syncing disks... stopping on keyboard abort
Sep 24 04:03:39 rangerover /netbsd: panic: PROM sync command
Sep 24 04:03:39 rangerover /netbsd: Frame pointer is at 0xf0326000
Sep 24 04:03:39 rangerover /netbsd: Call traceback:
....
Depending on the installation, it looks like a non-privileged user can
cause this panic.
Disassembly:
Dump of assembler code for function lwp_continue:
0xf018de7c <lwp_continue>: save %sp, -104, %sp
0xf018de80 <lwp_continue+4>: sethi %hi(0xf0331800), %o0
0xf018de84 <lwp_continue+8>: ld [ %o0 + 0x304 ], %o1 ! 0xf0331b04 <lwp_debug>
0xf018de88 <lwp_continue+12>: cmp %o1, 0
0xf018de8c <lwp_continue+16>: be 0xf018deb4 <lwp_continue+56>
0xf018de90 <lwp_continue+20>: sethi %hi(0xf02e6c00), %o0
0xf018de94 <lwp_continue+24>: ld [ %i0 + 0x10 ], %o3
0xf018de98 <lwp_continue+28>: ld [ %o3 + 0x34 ], %o1
0xf018de9c <lwp_continue+32>: or %o0, 0x3a0, %o0
0xf018dea0 <lwp_continue+36>: ld [ %i0 + 0x28 ], %o2
0xf018dea4 <lwp_continue+40>: add %o3, 0x159, %o3
0xf018dea8 <lwp_continue+44>: ld [ %i0 + 0x24 ], %o4
0xf018deac <lwp_continue+48>: call 0xf01ae280 <printf>
0xf018deb0 <lwp_continue+52>: ld [ %i0 + 0x34 ], %o5
0xf018deb4 <lwp_continue+56>: ld [ %i0 + 0x24 ], %o0
0xf018deb8 <lwp_continue+60>: cmp %o0, 8
0xf018debc <lwp_continue+64>: bne 0xf018dee4 <lwp_continue+104>
0xf018dec0 <lwp_continue+68>: nop
0xf018dec4 <lwp_continue+72>: ld [ %i0 + 0x34 ], %o0
0xf018dec8 <lwp_continue+76>: cmp %o0, 0
0xf018decc <lwp_continue+80>: bne 0xf018dee0 <lwp_continue+100>
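
The fault message can be read against the disassembly above with a little
arithmetic. The sketch below uses only the numbers quoted in this report; the
variable names are made up for illustration. On SPARC, %i0 holds the first
incoming argument after `save', so a base of zero would suggest that
lwp_continue was entered with a null pointer. This is an interpretation of
the log, not a confirmed diagnosis.

```python
# Values copied from the panic message and the disassembly in this report.
FAULT_PC = 0xF018DEB4    # pc= in the "data fault" line
FAULT_ADDR = 0x24        # addr= in the "data fault" line
FUNC_START = 0xF018DE7C  # address of <lwp_continue>
LOAD_OFFSET = 0x24       # displacement in "ld [ %i0 + 0x24 ], %o0"

# Offset of the faulting instruction inside lwp_continue.
pc_offset = FAULT_PC - FUNC_START

# The load reads from %i0 + LOAD_OFFSET, so the base register held:
base_register = FAULT_ADDR - LOAD_OFFSET

print(f"faulting instruction: lwp_continue+{pc_offset}")
print(f"%i0 at fault time: {base_register:#x}")
```

The computed offset can be matched by hand against the <lwp_continue+NN>
labels in the listing to find the instruction that faulted.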
>How-To-Repeat:
Follow the steps in the description above.
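
The signal part of those steps can be sketched as a small script. This is an
assumption-laden illustration, not a verified reproducer: the helper
`send_signals' is hypothetical, and the target here is a harmless sleeping
child rather than a threaded mysqld. The description names `TERM' but gives
the number 9, which is SIGKILL, so the final signal is left as a list entry
that can be swapped.

```python
import os
import signal
import subprocess
import sys
import time


def send_signals(pid, signals, pause=0.2):
    """Send each signal in order to pid, pausing briefly in between."""
    for sig in signals:
        os.kill(pid, sig)
        time.sleep(pause)


# Harmless stand-in target (assumption): a child that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Sequence from the description: CONT first, then TERM.
# Replace signal.SIGTERM with signal.SIGKILL to send signal 9 instead.
send_signals(child.pid, [signal.SIGCONT, signal.SIGTERM])

# On POSIX systems a negative status means the child died from that signal.
exit_status = child.wait(timeout=10)
print("child exit status:", exit_status)
```

To try it against a real process, pass that process's pid to `send_signals'
instead of spawning the child; whether this alone triggers the panic is not
established by this report.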
>Fix:
Update or downgrade the kernel? Maybe the problem is not present in the 2-0 release?
>Release-Note:
>Audit-Trail:
>Unformatted: