Subject: Re: parallel make locking up (on amd64)
To: Martin Husemann <martin@duskware.de>
From: Kurt Schreiner <ks@ub.uni-mainz.de>
List: port-amd64
Date: 07/13/2006 13:59:25
On Wed, Jul 12, 2006 at 09:35:50PM +0200, Martin Husemann wrote:
> On Wed, Jul 12, 2006 at 04:49:44PM +0200, Kurt Schreiner wrote:
> > Is there anything I can do to help debugging this? Sendpr?
>
> Often a kernel with options LOCKDEBUG helps to debug this kind of problem.
Hm, I'm running a kernel with "options LOCKDEBUG" now; what can I do next?
The makes are hung again, but I only get these hangs when a union mount is
involved. Softdeps or not makes no difference: adding the union-mounted
"local modifications" (which are empty, btw) reliably leads to hung makes.
Hm! So softdeps (which christos suggested) don't seem to be the culprit
this time...
Here's the output from ddb:
login: ~Stopped at netbsd:cpu_Debugger+0x5: leave
db{0}> ps
  PID  PPID  PGRP UID S  FLAGS LWPS COMMAND WAIT
25375 23247 25375  77 2 0x4002    1 tcsh    ttyin
23247 15946 15946  77 2  0x100    1 sshd    select
15946  1085 15946   0 2 0x4101    1 sshd    netio
  684     1   466  77 2 0x4002    1 nbmake  vnlock
  436     1  3830  77 2 0x4002    1 nbmake  vnlock
  312  1143   312  77 2 0x4002    1 tcsh    ttyin
 1143   245   245  77 2  0x100    1 sshd    select
  245  1085   245   0 2 0x4101    1 sshd    netio
  244     1   244   0 2 0x4002    1 getty   ttyin
  243     1   243   0 2 0x4002    1 getty   ttyin
  242     1   242   0 2 0x4002    1 getty   ttyin
  241     1   241   0 2 0x4002    1 getty   ttyin
  236     1   236   0 2      0    1 cron    nanosle
  234     1   234   0 2      0    1 inetd   kqread
 1085     1  1085   0 2      0    1 sshd    select
  166     1   166  15 2  0x100    1 ntpd    pause
   98   925   925   0 2      0    1 nfsd    nfsd
   97   925   925   0 2      0    1 nfsd    nfsd
   96   925   925   0 2      0    1 nfsd    nfsd
  919   925   925   0 2      0    1 nfsd    nfsd
  925     1   925   0 2      0    1 nfsd    poll
db{0}> bt/t 0t436
trace: pid 436 at 0xffff80006310e9f0
ltsleep() at netbsd:ltsleep+0x42a
acquire() at netbsd:acquire+0x25c
_lockmgr() at netbsd:_lockmgr+0xb05
VOP_LOCK() at netbsd:VOP_LOCK+0x28
vn_lock() at netbsd:vn_lock+0x97
union_lock() at netbsd:union_lock+0x7f
VOP_LOCK() at netbsd:VOP_LOCK+0x28
vn_lock() at netbsd:vn_lock+0x97
union_dircache() at netbsd:union_dircache+0x22
union_readdirhook() at netbsd:union_readdirhook+0x61
vn_readdir() at netbsd:vn_readdir+0x13b
sys___getdents30() at netbsd:sys___getdents30+0xec
syscall_plain() at netbsd:syscall_plain+0x122
kernel: page fault trap, code=0
Faulted in DDB; continuing...
db{0}> bt/t 0t684
trace: pid 684 at 0xffff800063152b50
ltsleep() at netbsd:ltsleep+0x42a
acquire() at netbsd:acquire+0x25c
_lockmgr() at netbsd:_lockmgr+0x8d7
VOP_LOCK() at netbsd:VOP_LOCK+0x28
vn_lock() at netbsd:vn_lock+0x97
union_lock() at netbsd:union_lock+0x7f
VOP_LOCK() at netbsd:VOP_LOCK+0x28
vn_lock() at netbsd:vn_lock+0x97
vn_readdir() at netbsd:vn_readdir+0xcb
sys___getdents30() at netbsd:sys___getdents30+0xec
syscall_plain() at netbsd:syscall_plain+0x122
kernel: page fault trap, code=0
Faulted in DDB; continuing...
db{0}>
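In case it helps anyone sifting through a longer "ps" listing from ddb,
here is a rough sketch in Python that picks out the processes sleeping on
a given wait channel. It assumes the nine whitespace-separated columns
seen in the listing above (PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT);
the function name procs_waiting_on and the SAMPLE text are made up for
illustration and are not part of any NetBSD tool.

```python
def procs_waiting_on(ps_text, wchan):
    """Return (pid, command) pairs whose WAIT column equals wchan."""
    hits = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Assumed layout: PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
        if len(fields) != 9 or not fields[0].isdigit():
            continue  # skip the header and anything that is not a process row
        pid, command, wait = int(fields[0]), fields[7], fields[8]
        if wait == wchan:
            hits.append((pid, command))
    return hits


# Hypothetical sample: a few rows copied from a captured ddb "ps" listing.
SAMPLE = """\
PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
684 1 466 77 2 0x4002 1 nbmake vnlock
436 1 3830 77 2 0x4002 1 nbmake vnlock
312 1143 312 77 2 0x4002 1 tcsh ttyin
"""

if __name__ == "__main__":
    for pid, command in procs_waiting_on(SAMPLE, "vnlock"):
        print(pid, command)
```

Each pid it reports can then be handed to "bt/t 0t<pid>" in ddb, as done
above for 436 and 684.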
Kurt