NetBSD-Bugs archive
kern/42228: Kernel deadlock prevents further file system access
>Number: 42228
>Category: kern
>Synopsis: Kernel deadlock prevents further file system access
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 25 12:10:00 +0000 2009
>Originator: Matthias Scheler
>Release: NetBSD 5.0_STABLE sources from 2009-10-11
>Organization:
Matthias Scheler http://zhadum.org.uk/
>Environment:
System: NetBSD colwyn.zhadum.org.uk 5.0_STABLE NetBSD 5.0_STABLE (COLWYN.64)
#1: Sun Oct 11 21:22:04 BST 2009
tron%colwyn.zhadum.org.uk@localhost:/src/sys/compile/COLWYN.64 amd64
Architecture: x86_64
Machine: amd64
>Description:
My NetBSD/amd64 5.0/5.0_STABLE server has locked up on multiple occasions.
The symptoms are always identical:
1.) The system still responds to ICMP Echo packets.
2.) Services that don't (frequently) access the file system, e.g. BIND,
still work.
3.) Any other processes (Postfix, Apache, inetd, etc.) are stuck.
I have to recover the machine by dropping into "ddb" and using "sync"
or "reboot" to force a restart.
The system managed to write a crash dump (to "raid0b" in case it
matters) after several of the crashes but "savecore" always claims
there is no crash dump once the system is back up. The only debug
information I therefore have is the stack traces of several of
the hung processes after the latest incident. The stack traces all
look like this:
sleepq_block
turnstile_block
rw_vector_enter
vlockmgr
VOP_LOCK
vn_lock
namei
The function that follows "namei" in the trace (i.e. its caller) differs
from case to case. Examples are vn_open() and do_sys_stat().
If someone sends me instructions for producing more useful debugging
information, I will of course provide it.
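One low-effort way to narrow down which file system stops responding first
might be a small user-space probe that is started before a lock up occurs.
The following is only a sketch under assumptions, not a tested diagnostic
for this problem: the function name probe(), the 5 second timeout and the
inclusion of "/" are made up for illustration; the other paths are the
mount points mentioned in this report. Each stat() runs in a forked child
so that the parent can give up on it if the child blocks in the kernel.

```python
import os
import signal
import time


def probe(path, timeout=5.0):
    """Return True if stat(path) returns within `timeout` seconds.

    Hypothetical helper: the lookup runs in a forked child, because a
    process blocked inside the kernel cannot time itself out.
    """
    pid = os.fork()
    if pid == 0:
        # Child: perform the lookup that may block, then exit at once.
        try:
            os.stat(path)
        except OSError:
            pass  # an error still means the lookup returned
        os._exit(0)

    deadline = time.monotonic() + timeout
    while True:
        # WNOHANG keeps the parent from blocking on a stuck child.
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return True
        if time.monotonic() >= deadline:
            break
        time.sleep(0.05)

    # Best effort only: a child sleeping in the kernel may ignore this,
    # and it is deliberately not waited for afterwards.
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    return False


if __name__ == "__main__":
    # "/" is an assumption; the rest are taken from this report.
    for path in ("/", "/tmp", "/home", "/share", "/scratch", "/volumes"):
        state = "ok" if probe(path) else "HUNG"
        print(f"{path}: {state}")
```

The parent polls with WNOHANG instead of doing a blocking wait because a
child stuck in vn_lock() may never exit, and a blocking wait would hang the
probe as well. The probe has to be running already when the lock up
happens, since starting a new process needs file system access itself.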
>How-To-Repeat:
I'm not exactly sure what triggers it. Contributing factors might be:
- "/tmp" on tmpfs
- amd(8)
I use amd(8) for managing "/home" and three more top level directory
hierarchies. The configuration looks like this:
[global]
auto_attrcache = 1
search_path = /etc/amd
unmount_on_exit = yes
[ /home ]
map_name = amd.home
map_type = file
[ /share ]
map_name = amd.share
map_type = file
[ /scratch ]
map_name = amd.scratch
map_type = file
[ /volumes ]
map_name = volumes
map_type = file
- rTorrent
rTorrent's mmap()-based I/O handling caused problems with WAPBL in the
past, mostly when rTorrent was downloading files. During most of my
lock ups, however, it was only seeding.
- WAPBL
That is unlikely because reducing the number of file systems which use
WAPBL didn't help. The only remaining file systems with WAPBL turned
on should have been idle during the last lock up.
Running rTorrent and/or WAPBL might be one of the contributing factors.
Using amd(8) could also be a reason.
>Fix:
Not known.