Subject: kern/20653: lfs_segwrite panic
To: None <gnats-bugs@gnats.netbsd.org>
From: None <scotte@warped.com>
List: netbsd-bugs
Date: 03/11/2003 02:57:07
>Number: 20653
>Category: kern
>Synopsis: LFS filesystem on -current causes panic as soon as lfs_cleanerd tries to clean it
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 10 18:58:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator: Scott Ellis
>Release: NetBSD 1.6P
>Organization:
//////////////////////////////////////////////////////////////////////
// Scott Ellis // scotte@warped.com //
//////////////////////////////////////////////////////////////////////
// WARNING: This signature warps time and space in its vicinity //
>Environment:
System: NetBSD intrepid 1.6P NetBSD 1.6P (INTREPID.APM.DDB) #0: Mon Mar 10 17:45:21 PST 2003 scotte@intrepid:/usr/src/sys/arch/i386/compile/INTREPID.APM.DDB i386
Architecture: i386
Machine: i386
Userland and kernel are from the same day.
>Description:
The affected filesystem is the LFS partition /dev/wd1f in the following table:
intrepid# df -i
Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on
/dev/wd0a 3060024 2640320 266696 90% 55398 40856 57% /
mfs:37 97863 4 92965 0% 6 24760 0% /tmp
/dev/wd0e 992812 68280 874888 7% 7050 241780 2% /var
/dev/wd0f 112381288 89393216 17369000 83% 224627 3295371 6% /misc
/dev/wd1f 99693798 83534000 11175110 88% 187092 130520428 0% /mounts/tempmisc
Seemingly as soon as lfs_cleanerd starts running, the system panics with:
lfs_segwrite: possibly invalid checkpoint!
lfs_segwrite: ifile still has dirty blocks?!
bp=0xca29b6e0, lbn 51, flags 0x24080
bp=0xca2afe10, lbn 35, flags 0x24080
panic: dirty blocks
syncing disks...
This is repeatable:
lfs_segwrite: possibly invalid checkpoint!
lfs_segwrite: ifile still has dirty blocks?!
bp=0xca2de1d0, lbn 44, flags 0x24080
bp=0xca2dda50, lbn 51, flags 0x24080
panic: dirty blocks
Stopped in pid 492.1 (sync) at cpu_Debugger+0x4: leave
db> bt
cpu_Debugger(0,e38d5000,0,c017bae6,c027446e) at cpu_Debugger+0x4
panic(c027448c,e3c20000,e38d5000,212,c0f73000) at panic+0xb8
lfs_segwrite(c10f1600,5,c0406224,0,0) at lfs_segwrite+0x54b
lfs_sync(c10f1600,2,c1116000,e3be16a0,e3cd3488) at lfs_sync+0x74
sys_sync(e3cd3488,e3f79f80,e3f79f78,c021bf64,8049c60) at sys_sync+0x66
syscall_plain(1f,1f,1f,1f,bfbff6dc) at syscall_plain+0xab
db>
lfs_segwrite: possibly invalid checkpoint!
lfs_segwrite: ifile still has dirty blocks?!
bp=0xca2e0570, lbn 51, flags 0x24080
bp=0xca2e1470, lbn 35, flags 0x24080
panic: dirty blocks
Stopped in pid 11.1 (ioflush) at cpu_Debugger+0x4: leave
db> bt
cpu_Debugger(0,e3bf8000,0,c017bae6,c027446e) at cpu_Debugger+0x4
panic(c027448c,e3bfa000,e3bf8000,212,c10ef800) at panic+0xb8
lfs_segwrite(c1095a00,5,e38c7f68,c01e0e1d,0) at lfs_segwrite+0x54b
lfs_sync(c1095a00,3,c0e69f00,e38c31a8,e3f4c8f4) at lfs_sync+0x74
sync_fsync(e38c7f68,12,0,0,e389c500) at sync_fsync+0x5c
sched_sync(e389c500,e38b5500,0,0,c010030c) at sched_sync+0x11e
db>
fsck_lfs shows no problems with the partition. Note that the partition got
to its current state via a simple "pax -rw -pe" from /misc to /mounts/tempmisc,
immediately after newfs_lfs'ing and mounting it.
The system hangs after "syncing disks..." and never generates a dump file,
so I don't have an image for a thorough post-mortem.
>How-To-Repeat:
Presumably, just make an LFS partition and start copying stuff to it. ;-)
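A rough sketch of that sequence, using the device and mount points from the
df output in the Description. The exact newfs_lfs and mount arguments are
assumptions, since the original command lines were not recorded. The script
only prints the commands instead of running them, because newfs_lfs destroys
whatever is on the target partition.

```shell
#!/bin/sh
# Print (do not run) the steps that led to the panic.
# Arguments: LFS device, source tree, mount point.
# The invocations below are assumed, not copied from a shell history.
repro_cmds() {
    dev=$1
    src=$2
    dst=$3
    echo "newfs_lfs $dev"                  # create a fresh LFS (destructive)
    echo "mount -t lfs $dev $dst"          # mount it
    echo "cd $src && pax -rw -pe . $dst"   # copy the existing tree over
    echo "sync"                            # the backtraces show lfs_sync
}

# Values taken from this report.
repro_cmds /dev/wd1f /misc /mounts/tempmisc
```

Review the printed commands and run them by hand only against a scratch
partition.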
>Fix:
I wish I knew. ;-) I'd really like to use LFS, but this seems to be
an unrecoverable problem here (since fsck_lfs finds nothing to fix, and the
panic happens as soon as there's activity).
If there's any further info that is needed to debug this, let me know. It's
repeatable within seconds. ;-)
>Release-Note:
>Audit-Trail:
>Unformatted: