tech-kern archive


Re: Anyone recall the dreaded tstile issue?



I've been exchanging email off-list about this with a few people.  One
of them remarked that a kernel coredump would help.

Yesterday it wedged again.  I got a kernel coredump...and, well, as I
put it in off-list mail:

>> I now realize I don't know how to coax [process stack traces] out of
>> a kernel core.  I don't recall hearing of any sort of postmortem
>> ddb.  I have the corresponding netbsd.gdb, and I found gdb's
>> target kvm, but I haven't managed to get a stack trace for any
>> process out of it.

The response turned out to be exactly the cluesticking I needed to get
stack traces.

I've now got (kernel) stack traces.  They explain very neatly how
unrelated processes end up in puffsrpl - it's the vnode version of the
memory-pressure theory I mentioned (as implausible) upthread:

#0  0xc04b7beb in mi_switch ()
#1  0xc04b3dbb in sleepq_block ()
#2  0xc048eb0f in cv_wait_sig ()
#3  0xc038b3ea in puffs_msg_wait ()
#4  0xc038b547 in puffs_msg_wait2 ()
#5  0xc038ff40 in puffs_vnop_inactive ()
#6  0xc05281f8 in VOP_INACTIVE ()
#7  0xc051b7bc in vclean ()
#8  0xc051d36a in getcleanvnode ()
#9  0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc03f3a45 in ffs_valloc ()
#12 0xc042f052 in ufs_makeinode ()
#13 0xc04309fa in ufs_create ()
#14 0xc05290af in VOP_CREATE ()
#15 0xc0525df2 in vn_open ()
#16 0xc0521d44 in sys_open ()
#17 0xc05a9fcf in syscall ()
#18 0xc010058e in syscall1 ()

#0  0xc04b7beb in mi_switch ()
#1  0xc04b3dbb in sleepq_block ()
#2  0xc048eb0f in cv_wait_sig ()
#3  0xc038b3ea in puffs_msg_wait ()
#4  0xc038b547 in puffs_msg_wait2 ()
#5  0xc038ff40 in puffs_vnop_inactive ()
#6  0xc05281f8 in VOP_INACTIVE ()
#7  0xc051b7bc in vclean ()
#8  0xc051d36a in getcleanvnode ()
#9  0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc03f3a45 in ffs_valloc ()
#12 0xc042f052 in ufs_makeinode ()
#13 0xc04309fa in ufs_create ()
#14 0xc05290af in VOP_CREATE ()
#15 0xc0525df2 in vn_open ()
#16 0xc0521d44 in sys_open ()
#17 0xc05a9fcf in syscall ()
#18 0xc010058e in syscall1 ()

#0  0xc04b7beb in mi_switch ()
#1  0xc04b3dbb in sleepq_block ()
#2  0xc048eb0f in cv_wait_sig ()
#3  0xc038b3ea in puffs_msg_wait ()
#4  0xc038b547 in puffs_msg_wait2 ()
#5  0xc038ff40 in puffs_vnop_inactive ()
#6  0xc05281f8 in VOP_INACTIVE ()
#7  0xc051b7bc in vclean ()
#8  0xc051d36a in getcleanvnode ()
#9  0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc042d59b in ufs_lookup ()
#12 0xc052917c in VOP_LOOKUP ()
#13 0xc0516ddb in lookup ()
#14 0xc05175c5 in namei ()
#15 0xc05205a6 in sys_access ()
#16 0xc05a9fcf in syscall ()
#17 0xc010058e in syscall1 ()

(Arguments are not shown because I made a stupid mistake; I did not
have a netbsd.gdb available.  But the above traces are informative
enough, to me.)

There was a git process, it was a child of the main gitfs process, and
it was in puffsrpl (it's the last of the above stack traces).

So my best-guess theory now is that I have a codepath somewhere in
gitfs that forks git and waits for it to finish _without_ processing
other puffs requests while waiting.  There should be no such path, but I
can't explain this any other way.  The gitfs process is blocked in
select, but that's exactly what I'd expect.
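
To make the suspected failure mode concrete, here is a minimal sketch of
what the non-blocking version of such a codepath would look like: the
parent forks a child and, while collecting its output, keeps polling a
second descriptor and servicing it.  This is not gitfs code.  The names
run_child_serving, serve_fn and serve_one are hypothetical, and req_fd
is only a stand-in for wherever puffs requests arrive; no real puffs
API is used.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback: handle one pending request on fd. */
typedef void (*serve_fn)(int fd);

/* Demonstration callback: consume one byte and count it. */
static int served_count;
static void
serve_one(int fd)
{
	char c;

	if (read(fd, &c, 1) == 1)
		served_count++;
}

/*
 * Run argv[] as a child and collect its stdout into out[] (NUL-terminated,
 * truncated to outsz-1 bytes) while still servicing req_fd.
 * Returns the child's wait status, or -1 on error.
 */
static int
run_child_serving(char *const argv[], int req_fd, serve_fn serve,
    char *out, size_t outsz)
{
	int p[2], status, rfd = req_fd;
	size_t used = 0;
	pid_t pid;

	assert(outsz > 0);
	if (pipe(p) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: send stdout to the pipe, then become the command. */
		dup2(p[1], STDOUT_FILENO);
		close(p[0]);
		close(p[1]);
		execvp(argv[0], argv);
		_exit(127);
	}
	close(p[1]);

	for (;;) {
		struct pollfd pfd[2];
		char buf[512];
		ssize_t n;
		size_t room, take;

		pfd[0].fd = p[0]; pfd[0].events = POLLIN; pfd[0].revents = 0;
		pfd[1].fd = rfd;  pfd[1].events = POLLIN; pfd[1].revents = 0;
		if (poll(pfd, 2, -1) < 0) {
			if (errno == EINTR)
				continue;
			break;
		}
		/* Service requests first, so nobody waits on the child. */
		if (pfd[1].revents & POLLIN)
			serve(rfd);
		else if (pfd[1].revents & (POLLHUP | POLLERR | POLLNVAL))
			rfd = -1;	/* poll() ignores negative fds */
		if (pfd[0].revents & (POLLIN | POLLHUP | POLLERR)) {
			n = read(p[0], buf, sizeof buf);
			if (n < 0 && errno == EINTR)
				continue;
			if (n <= 0)
				break;	/* EOF or error: stop collecting */
			room = outsz - 1 - used;
			take = (size_t)n < room ? (size_t)n : room;
			memcpy(out + used, buf, take);
			used += take;
		}
	}
	out[used] = '\0';
	close(p[0]);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return status;
}
```

The buggy variant I suspect would be the same loop with only the
child's pipe in the pollfd array: its kernel stack would still show
poll, but nothing would be answering requests.  Note that even this
sketch can block in the final waitpid() if the child closes stdout and
then lingers; a fuller version would fold child exit into the event
loop as well.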

I would now like _userland_ stack traces.  The kernel stack trace for
the main gitfs process is exactly what I'd expect:

#0  0xc04b7beb in mi_switch ()
#1  0xc04b3dbb in sleepq_block ()
#2  0xc04e60ed in pollcommon ()
#3  0xc04e639f in sys_poll ()
#4  0xc05a9fcf in syscall ()
#5  0xc010058e in syscall1 ()

but that does not rule the theory out: a wait for git to finish could
very well look the same, sitting in poll() waiting for git to print
output.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

