Subject: kern/14640: kernel hangs in syncing disks
To: None <gnats-bugs@gnats.netbsd.org>
From: None <mrauch@netbsdorg.fs.tum.de>
List: netbsd-bugs
Date: 11/19/2001 13:07:46
>Number: 14640
>Category: kern
>Synopsis: kernel hangs in syncing disks
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Nov 19 05:08:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator: Michael Rauch
>Release: 1.5Y (2001/11/18)
>Organization:
>Environment:
NetBSD i386, syssrc cvs update'd at 2001/11/18 about 12:00 GMT,
custom kernel (mainly GENERIC with unneeded drivers commented out)
>Description:
After heavy disk I/O the kernel hangs in a loop that it never exits.
Invoking ddb and switching virtual consoles are still possible.
The hang is in the function sched_sync (sys/miscfs/syncfs/sync_subr.c),
in the first while loop (starting at line 185 in rev. 1.10), which
executes the following functions over and over:
,>sched_sync
|  `-> vn_lock
        `-> VOP_LOCK
             `-> genfs_lock
                  `-> lockmgr
                  <--'
             <--'
        <--'
   <--'
   `-> VOP_FSYNC
        `-> genfs_fsync
             `-> vflushbuf
             <--'
             `-> VOP_UPDATE
                  `-> ext2fs_update
                  <--'
             <--'
        <--'
   <--'
   `-> VOP_UNLOCK
        `-> genfs_unlock
             `-> lockmgr
^            <--'
|       <--'
`-----'
mount:
/dev/wd0a on / type ffs (local)
/dev/wd0e on /usr type ffs (local)
/dev/wd0f on /windows type msdos (local)
/dev/wd0g on /usr/src type ext2fs (local)
mfs:118 on /tmp type mfs (asynchronous, local)
The heavy disk i/o was on the /usr/src partition (ext2fs filesystem).
Trying to `sync` from within ddb results in
panic: lockmgr: locking against myself
and drops back into ddb; another `sync` then reboots the machine.
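The panic text suggests that an exclusive lock was requested by the
context that already holds it. The following is only a toy sketch of
that kind of check, not the actual lockmgr code: struct toy_lock, the
toy_lock_* functions and the integer holder ids are all hypothetical
names made up for illustration.

```c
/* Hypothetical toy lock; NOT the NetBSD lockmgr implementation. */
#define TOY_NO_HOLDER (-1)

enum toy_result {
    TOY_OK,            /* lock was free and is now held by the caller */
    TOY_BUSY,          /* lock is held by somebody else */
    TOY_SELF_DEADLOCK  /* caller already holds it: "locking against myself" */
};

struct toy_lock {
    int holder;        /* id of the exclusive holder, or TOY_NO_HOLDER */
};

static void
toy_lock_init(struct toy_lock *lk)
{
    lk->holder = TOY_NO_HOLDER;
}

/* Try to take the lock exclusively on behalf of context `who`. */
static enum toy_result
toy_lock_acquire(struct toy_lock *lk, int who)
{
    if (lk->holder == who)
        return TOY_SELF_DEADLOCK;  /* a real kernel might panic here */
    if (lk->holder != TOY_NO_HOLDER)
        return TOY_BUSY;
    lk->holder = who;
    return TOY_OK;
}

/* Release the lock; only the current holder may do so. */
static int
toy_lock_release(struct toy_lock *lk, int who)
{
    if (lk->holder != who)
        return -1;
    lk->holder = TOY_NO_HOLDER;
    return 0;
}
```

In this model, a second toy_lock_acquire() by the same holder id returns
TOY_SELF_DEADLOCK instead of waiting forever, which is the situation the
panic message above appears to describe.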
Slight disk corruption can occur, although in most cases fsck reports
no errors on the disk.
Others have run into this problem as well; see
http://mail-index.netbsd.org/current-users/2001/11/14/0006.html
for the start of the thread.
>How-To-Repeat:
Perform operations that require a lot of disk I/O. The system then
locks up suddenly.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
>Unformatted: