Subject: Re: process wedged in vnlock
To: None <current-users@netbsd.org>
From: Tom Spindler <dogcow@babymeat.com>
List: current-users
Date: 02/27/2007 15:22:10
Here are the results. (I had to boot into multiuser mode, which was kind
of annoying; it didn't seem to work in single-user mode.)
db> ps/l
PID LID S FLAGS STRUCT LWP * UAREA * WAIT
116 1 3 0x1000004 0xceff11c0 0xcd22ece0 vnlock
1054 1 3 0x84 0xceff1340 0xcd04ece0 pause
977 1 3 0x84 0xceff1ac0 0xcc3aece0 ttyin
855 1 3 0x84 0xcb1b3020 0xcc65ece0 ttyin
1039 1 3 0x84 0xcb1b34a0 0xcbe6ece0 wait
946 1 3 0x84 0xcb1b37a0 0xcbca8ce0 ttyin
1013 1 3 0x84 0xceff14c0 0xcce7ece0 nanoslp
1050 1 3 0x84 0xceff17c0 0xccc1ece0 select
1023 1 3 0x84 0xceff1640 0xccd0ece0 select
952 1 3 0x84 0xceff1940 0xcc82ece0 select
764 1 3 0x84 0xceff1dc0 0xcc74ece0 select
726 1 3 0x84 0xcb1b31a0 0xcc59ece0 pause
435 1 2 0x4 0xcb1b3320 0xcc4aece0
349 1 3 0x84 0xcb1b3620 0xcbcacce0 select
91 1 3 0x204 0xceff1c40 0xcc98ece0 physiod
14 1 3 0x204 0xcb1b3920 0xcbca4ce0 aiodoned
13 1 3 0x204 0xcb1b3aa0 0xcbca1ce0 syncer
12 1 3 0x204 0xcb1b3c20 0xcbc9ece0 pgdaemon
11 1 3 0x204 0xcb1b3da0 0xcbc99ce0 sccomp
10 1 3 0x204 0xcb1ac000 0xcbc93ce0 apmev
9 1 3 0x284 0xcb1ac180 0xcbc90ce0 fwprobe
db> t/a 0xceff11c0
trace: pid 116 lid 1 at 0xcd22e9fc
sleepq_switch(0,0,cd1b7c74,c0363964,0) at netbsd:sleepq_switch+0x53
ltsleep(cd1b7c74,14,c0363964,0,cd1b7c74) at netbsd:ltsleep+0x13b
acquire(0,40500,c019de46,0,74) at netbsd:acquire+0x104
_lockmgr(cd1b7c74,10002,cd1b7bec,c0386930,937) at netbsd:_lockmgr+0x9bb
ufs_lock(cd22eb60,cd22eb94,cf0b2334,cd22ebb8,0) at netbsd:ufs_lock+0x3a
VOP_LOCK(cd1b7bec,10002,2b7,0,5) at netbsd:VOP_LOCK+0x23
vn_lock(cd1b7bec,20002,200,0,cf0b3a88) at netbsd:vn_lock+0x96
vn_close(cd1b7bec,5,cf0b2334,ceff11c0,ceff11c0) at netbsd:vn_close+0x21
vn_closefile(cf0b3a88,ceff11c0,5b3,0,0) at netbsd:vn_closefile+0x1a
closef(cf0b3a88,ceff11c0,cd22ec68,cd22ece0,8054000) at netbsd:closef+0x17c
syscall_plain() at netbsd:syscall_plain+0xb4
--- syscall (number 6) ---
0xbbb191fb:
db> show lock ufs_hashlock
lock address : 0x00000000c03d5cd4 type : sleep/adaptive
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 000000000000000000 last held: 000000000000000000
last locked : 0x00000000c01954da unlocked : 0x00000000c0195586
Turnstile chain at 0xc03dc200 with tc_mutex at 0xc03dc220.
=> No active turnstile for this lock.
db> x c01954da
netbsd:ffs_vget+0x96: 75ff026a
db> x c0195586
netbsd:ffs_vget+0x142: 8bd85d8b
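
In case it helps anyone sifting through a longer "ps/l" listing, here is a
rough sketch of how the rows above could be filtered for LWPs sleeping on a
given wait channel. The names find_sleepers and SAMPLE are made up for
illustration, and the parsing assumes the seven-column row layout shown in
this transcript (a row with an empty WAIT column has only six fields and is
skipped); other ddb versions may print something different.

```python
def find_sleepers(ps_output, wchans=("vnlock", "tstile")):
    """Return (pid, lwp_address, wchan) for LWPs sleeping on one of wchans.

    Assumes rows of the form: PID LID S FLAGS LWP-address UAREA-address WAIT,
    as in the ps/l listing from this thread.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the prompt, the header, and rows with no wait channel.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        pid, _lid, _state, _flags, lwp, _uarea, wchan = fields
        if wchan in wchans:
            hits.append((int(pid), lwp, wchan))
    return hits


# A few rows copied from the listing in this message.
SAMPLE = """\
db> ps/l
PID LID S FLAGS STRUCT LWP * UAREA * WAIT
116 1 3 0x1000004 0xceff11c0 0xcd22ece0 vnlock
1054 1 3 0x84 0xceff1340 0xcd04ece0 pause
435 1 2 0x4 0xcb1b3320 0xcc4aece0
13 1 3 0x204 0xcb1b3aa0 0xcbca1ce0 syncer
"""

if __name__ == "__main__":
    for pid, lwp, wchan in find_sleepers(SAMPLE):
        # The LWP address is what "t/a" takes as its argument.
        print(f"pid {pid} sleeping on {wchan}; try: t/a {lwp}")
```

The address in the second tuple element is the value passed to "t/a" to get
a backtrace, as was done for pid 116 above.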
On Tue, Feb 27, 2007 at 01:19:24PM -0800, Tom Spindler wrote:
> > Are you folks on uni-processor or multi-processor machines?
> > I've been trying to cause a dual-core box to fall over, and have been
> > unable to so far...
>
> Uniprocessor.
>
> > cvs rdiff -r1.4 -r1.5 src/sys/kern/kern_turnstile.c
>
> Nope.
>
> > If the trace includes ufs_ihash* it would be good to see the output of
> > "ps/l", and note if anything is sleeping on "tstile" if this happens again.
> > The output of "show lock ufs_hashlock" would be useful here. If it shows
> > that the lock is currently held, you can get a backtrace from the owner
> > using "t/a <address of lwp>".
>
> I'll try that when I get home.
>
> As it happens, I can trigger it within about ten seconds by running
> "find . -type d -print >/dev/null".
>