tech-kern archive
Re: Anyone recall the dreaded tstile issue?
I've been exchanging email off-list about this with a few people. One
of them remarked that a kernel coredump would help.
Yesterday it wedged again. I got a kernel coredump...and, well, as I
put it in off-list mail:
>> I now realize I don't know how to coax [process stack traces] out of
>> a kernel core. I don't recall hearing of any sort of postmortem
>> ddb. I have the corresponding netbsd.gdb, and I found gdb's
>> target kvm, but I haven't managed to get a stack trace for any
>> process out of it.
The response turned out to be exactly the cluesticking I needed to get
stack traces.
I've now got (kernel) stack traces. They explain very neatly how
unrelated processes end up in puffsrpl - it's the vnode version of the
memory-pressure theory I mentioned (as implausible) upthread:
#0 0xc04b7beb in mi_switch ()
#1 0xc04b3dbb in sleepq_block ()
#2 0xc048eb0f in cv_wait_sig ()
#3 0xc038b3ea in puffs_msg_wait ()
#4 0xc038b547 in puffs_msg_wait2 ()
#5 0xc038ff40 in puffs_vnop_inactive ()
#6 0xc05281f8 in VOP_INACTIVE ()
#7 0xc051b7bc in vclean ()
#8 0xc051d36a in getcleanvnode ()
#9 0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc03f3a45 in ffs_valloc ()
#12 0xc042f052 in ufs_makeinode ()
#13 0xc04309fa in ufs_create ()
#14 0xc05290af in VOP_CREATE ()
#15 0xc0525df2 in vn_open ()
#16 0xc0521d44 in sys_open ()
#17 0xc05a9fcf in syscall ()
#18 0xc010058e in syscall1 ()
#0 0xc04b7beb in mi_switch ()
#1 0xc04b3dbb in sleepq_block ()
#2 0xc048eb0f in cv_wait_sig ()
#3 0xc038b3ea in puffs_msg_wait ()
#4 0xc038b547 in puffs_msg_wait2 ()
#5 0xc038ff40 in puffs_vnop_inactive ()
#6 0xc05281f8 in VOP_INACTIVE ()
#7 0xc051b7bc in vclean ()
#8 0xc051d36a in getcleanvnode ()
#9 0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc03f3a45 in ffs_valloc ()
#12 0xc042f052 in ufs_makeinode ()
#13 0xc04309fa in ufs_create ()
#14 0xc05290af in VOP_CREATE ()
#15 0xc0525df2 in vn_open ()
#16 0xc0521d44 in sys_open ()
#17 0xc05a9fcf in syscall ()
#18 0xc010058e in syscall1 ()
#0 0xc04b7beb in mi_switch ()
#1 0xc04b3dbb in sleepq_block ()
#2 0xc048eb0f in cv_wait_sig ()
#3 0xc038b3ea in puffs_msg_wait ()
#4 0xc038b547 in puffs_msg_wait2 ()
#5 0xc038ff40 in puffs_vnop_inactive ()
#6 0xc05281f8 in VOP_INACTIVE ()
#7 0xc051b7bc in vclean ()
#8 0xc051d36a in getcleanvnode ()
#9 0xc051d52e in getnewvnode ()
#10 0xc0404aa3 in ffs_vget ()
#11 0xc042d59b in ufs_lookup ()
#12 0xc052917c in VOP_LOOKUP ()
#13 0xc0516ddb in lookup ()
#14 0xc05175c5 in namei ()
#15 0xc05205a6 in sys_access ()
#16 0xc05a9fcf in syscall ()
#17 0xc010058e in syscall1 ()
(Arguments are not shown because I made a stupid mistake; I did not
have a netbsd.gdb available. But the above traces are informative
enough, to me.)
There was a git process, it was a child of the main gitfs process, and
it was in puffsrpl (it's the last of the above stack traces).
So my best-guess theory now is that I have a codepath somewhere in
gitfs that forks git and waits for it to finish _without_ processing
other puffs requests while waiting. There should be no such codepath, but I
can't explain this any other way. The gitfs process is blocked in
select, but that's exactly what I'd expect.
I now would like _userland_ stack traces. The kernel stack trace for
the main gitfs process is exactly what I'd expect:
#0 0xc04b7beb in mi_switch ()
#1 0xc04b3dbb in sleepq_block ()
#2 0xc04e60ed in pollcommon ()
#3 0xc04e639f in sys_poll ()
#4 0xc05a9fcf in syscall ()
#5 0xc010058e in syscall1 ()
but waiting for git to finish could very well be in poll() waiting for
git to print output.
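To make the suspected failure mode concrete, here is a minimal sketch of the alternative: a loop that waits for a child's output while still answering incoming requests. This is not gitfs code. The names wait_child_servicing, req_handler, handle_req and run_demo are hypothetical, and plain pipes stand in for the puffs request channel. A loop that polled only the child's output descriptor would hang in the demo, because the child blocks until its request is answered.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: service one pending request on fd. */
typedef void (*req_handler)(int fd);

/*
 * Drain the child's output (childfd) until EOF, then reap pid.
 * Also watches reqfd and calls handle() whenever a request is
 * pending, so a request the child depends on still gets answered.
 * Returns the child's wait status, or -1 on error.
 */
int
wait_child_servicing(pid_t pid, int childfd, int reqfd, req_handler handle)
{
	struct pollfd pfd[2];
	char buf[512];
	int status;

	pfd[0].fd = childfd;
	pfd[0].events = POLLIN;
	pfd[1].fd = reqfd;
	pfd[1].events = POLLIN;

	for (;;) {
		if (poll(pfd, 2, -1) < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		/* Requests first: the child may be blocked on one. */
		if (pfd[1].revents & POLLIN)
			handle(reqfd);
		else if (pfd[1].revents & (POLLHUP | POLLERR | POLLNVAL))
			pfd[1].fd = -1;	/* poll() ignores negative fds */

		if (pfd[0].revents & (POLLERR | POLLNVAL))
			return -1;
		if (pfd[0].revents & (POLLIN | POLLHUP)) {
			ssize_t n = read(childfd, buf, sizeof(buf));
			if (n < 0 && errno != EINTR)
				return -1;
			if (n == 0)
				break;	/* EOF: child closed its output */
		}
	}
	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return status;
}

static int reply_wfd;	/* demo only: where handle_req sends its answer */

static void
handle_req(int fd)
{
	char c;

	if (read(fd, &c, 1) == 1)
		(void)write(reply_wfd, &c, 1);
}

/* Demo: the child cannot finish until the parent answers its request. */
int
run_demo(void)
{
	int out[2], req[2], rep[2], status;
	pid_t pid;
	char c = 'q';

	if (pipe(out) < 0 || pipe(req) < 0 || pipe(rep) < 0)
		return -1;
	if ((pid = fork()) < 0)
		return -1;
	if (pid == 0) {
		close(out[0]); close(req[0]); close(rep[1]);
		if (write(req[1], &c, 1) != 1 || read(rep[0], &c, 1) != 1)
			_exit(1);
		(void)write(out[1], "done\n", 5);
		_exit(7);
	}
	close(out[1]); close(req[1]); close(rep[0]);
	reply_wfd = rep[1];
	status = wait_child_servicing(pid, out[0], req[0], handle_req);
	close(out[0]); close(req[0]); close(rep[1]);
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_demo() from a main() should return the child's exit code. Whether gitfs has a wait loop shaped like the one-descriptor variant is exactly what the userland stack traces would need to show.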
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B