Subject: Re: kern/22080: switching with held simple_lock
To: NetBSD GNATS submissions and followups <gnats-bugs@gnats.netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: netbsd-bugs
Date: 08/25/2003 19:03:08
I finally got a custom kernel built with both the MULTIPROCESSOR and
LOCKDEBUG options (my alpha was powered off during the mini Ontario
energy crisis last week :-).

I had been doing a wee bit of compiling and messing around under the
new kernel, and while I was manually copying some files from NFS to
local disk it just hung, rock solid.

I think this may be related to PR#22080, though given what I've been
reading about LOCKDEBUG I find it surprising that no error message was
printed on the console....

(I still haven't looked into why it doesn't see BREAK on the serial
console, but luckily you can halt it from the RCM.)
Aug 25 18:36:54 building su: woods to root on /dev/ttyp0
[halt sent]
RCM>status
Firmware Rev: V1.1
Escape Sequence: ^]^]RCM
Remote Access: DISABLE
Alerts: DISABLE
Alert Pending: NO
Temp (C): 37.0
RCM Power Control: ON
External Power: OFF
Server Power: ON
RCM>halt
Focus returned to COM port
halted CPU 0
CPU 1 is not halted
halt code = 1
operator initiated halt
PC = fffffc000044f51c
P00>>>cont
continuing CPU 0
CP - RESTORE_TERM routine to be called
panic: user requested console halt
Stopped in pid 10838 (sh) at cpu_Debugger+0x4: ret zero,(ra)
db{0}> trace
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0x168
console_restart() at console_restart+0x74
XentRestart() at XentRestart+0x90
--- console restart (from ipl 4) ---
_simple_lock() at _simple_lock+0x15c
wakeup() at wakeup+0xcc
schedcpu() at schedcpu+0x34c
softclock() at softclock+0x2b4
hardclock() at hardclock+0x7c0
interrupt() at interrupt+0x180
XentInt() at XentInt+0x1c
--- interrupt (from ipl 0) ---
pmap_enter() at pmap_enter+0xb08
uvm_fault() at uvm_fault+0x2020
uvm_fault_wire() at uvm_fault_wire+0x74
uvm_fork() at uvm_fork+0x98
fork1() at fork1+0x548
sys_fork() at sys_fork+0x38
syscall_plain() at syscall_plain+0x164
XentSys() at XentSys+0x5c
--- syscall (2) ---
--- user mode ---
db{0}> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
>10838 10837 10837 0 7 0x84006 sh
10837 10720 10837 0 3 0x84086 nbmake piperd
10720 251 10720 0 3 0x84086 ksh pause
686 676 686 1000 3 0x84086 ksh ttyin
676 240 676 0 3 0x84084 rlogind select
251 249 251 1000 3 0x84086 ksh pause
249 240 249 0 3 0x84084 rlogind select
248 1 248 0 3 0x84086 getty ttyin
246 1 246 0 3 0x80084 cron nanosle
240 1 240 0 3 0x80084 inetd select
211 1 211 0 3 0x80084 ntpd pause
176 1 163 0 3 0x80084 nfsd nfsd
175 1 163 0 3 0x80084 nfsd nfsd
174 1 163 0 3 0x80084 nfsd nfsd
173 1 163 0 3 0x80084 nfsd nfsd
135 0 0 0 3 0xa0284 nfsio nfsidl
134 0 0 0 3 0xa0284 nfsio nfsidl
133 0 0 0 3 0xa0284 nfsio nfsidl
132 0 0 0 3 0xa0284 nfsio nfsidl
130 1 130 0 3 0x80084 mount_mfs mfsidl
121 1 121 0 3 0x80084 rpcbind select
106 1 106 0 3 0x80084 ipmon nanosle
99 1 99 0 3 0x80084 syslogd select
9 0 0 0 3 0xa0204 aiodoned aiodone
8 0 0 0 3 0xa0204 ioflush syncer
7 0 0 0 3 0x20204 reaper reaper
6 0 0 0 3 0xa0204 pagedaemon pgdaemo
5 0 0 0 3 0xa0204 pms0 pmsrese
4 0 0 0 3 0xa0204 mlxtask mlxzzz
3 0 0 0 3 0xa0204 scsibus1 sccomp
2 0 0 0 3 0xa0204 scsibus0 sccomp
1 0 1 0 3 0x84084 init wait
0 -1 0 0 2 0xa0204 swapper
db{0}> cont
syncing disks... [halt sent]
RCM>halt
Focus returned to COM port
CP - RESTORE_TERM exited with hlt_req = 0, r0 = 00000007.00000000
halted CPU 0
CPU 1 is not halted
halt code = 0
PC = fffffc000044f51c
P00>>>cont
Slot context is not valid
P00>>>init
Initializing...
--
Greg A. Woods
+1 416 218-0098 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>