Subject: kern/21039: panic: ffs_alloccg: map corrupted after UFS2 upgrade
To: gnats-bugs@gnats.netbsd.org
From: stephenm@employees.org
List: netbsd-bugs
Date: 04/06/2003 07:12:55
>Number: 21039
>Category: kern
>Synopsis: panic: ffs_alloccg: map corrupted after UFS2 upgrade
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 06 08:22:01 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator: Stephen Ma
>Release: NetBSD 1.6Q 2003-04-02
>Organization:
People's Front for the correct spelling of the word "Organisation"
>Environment:
System: NetBSD whitewater.local 1.6Q NetBSD 1.6Q (WHITEWATER) #7: Wed Apr 2 18:48:35 PST 2003 stephenm@whitewater.local:/v1/netbsd/obj/src/sys/arch/i386/compile/WHITEWATER i386
Architecture: i386
Machine: i386
>Description:
The kernel panics with "panic: ffs_alloccg: map corrupted" soon after
booting a kernel that includes the new UFS2 support. The panic occurs
when writing to a partition that works fine with a kernel built before
UFS2 support was added (around 2003-03-27). The partition is at 80%
capacity and has been in regular use with the pre-UFS2 kernel for a
long time, including many full NetBSD release builds. Softdep is
enabled on the partition.
The panic appears to happen on the first write (or possibly the first
inode allocation) on that partition after booting the UFS2-enabled
kernel. The same kernel does not panic when writing to other (smaller)
partitions on the same box, so the panic appears to depend on the
particular partition being accessed.
The panic is reliably reproducible: it has happened several times,
apparently always on the first write to the partition after a reboot.
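As a rough illustration of what a "map corrupted" failure means, the
sketch below models an allocator that trusts a free-space summary count
and then scans a free-fragment bitmap, wrapping around once. It is a
simplified model under stated assumptions, not the kernel's code: the
names MapCorrupted, find_free_byte and alloc are hypothetical, a set bit
is assumed to mean "fragment free", and the real search looks for a run
of fragments of the requested size rather than any free bit.

```python
# Simplified model of a wrap-around search over a cylinder group's
# free-fragment bitmap. All names are hypothetical; this is not kernel code.


class MapCorrupted(Exception):
    """The summary count promised free space but the bitmap shows none."""


def find_free_byte(bitmap: bytes, start: int) -> int:
    """Return the index of the first bitmap byte with any free bit set.

    Assumption: a set bit marks a free fragment. The scan runs from
    `start` to the end of the map, then wraps to cover bytes 0..start.
    """
    # First pass: from the preferred position to the end of the map.
    for i in range(start, len(bitmap)):
        if bitmap[i]:
            return i
    # Second pass: wrap around and cover the bytes skipped above.
    for i in range(0, min(start + 1, len(bitmap))):
        if bitmap[i]:
            return i
    # Both passes failed: the bitmap disagrees with the summary count.
    raise MapCorrupted(f"start = {start}, len = {len(bitmap) - start}")


def alloc(bitmap: bytes, free_count: int, start: int) -> int:
    """Search the bitmap only if the summary count claims free space."""
    if free_count == 0:
        return -1  # nothing claimed free; a caller would try another group
    return find_free_byte(bitmap, start)
```

In this model the failure is only reachable when the summary count and
the bitmap disagree, which is one way an on-disk layout that is read
differently by two kernels could show up on the first allocation.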
The partition is on an IDE drive that probes as:
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <TOSHIBA MK1214GAP>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 11513 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 23579136 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
A transcript of the panic is included below. A full copy of the dumpfs
output is available on request.
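The superblock values in the dumpfs excerpt under How-To-Repeat can be
cross-checked with simple arithmetic. The sketch below copies those
values and tests a few relations between them. It assumes (this is not
confirmed anywhere in the report) that the "start" and "len" printed
just before the panic are byte offsets into a bitmap with one bit per
fragment. The helper howmany is defined locally for illustration.

```python
# Values copied from the dumpfs output for /v1 in this report.
frag = 8        # fragments per block
bpg = 17955     # blocks per cylinder group
fpg = 143640    # fragments per cylinder group
cpg = 76        # cylinders per cylinder group
spc = 15120     # sectors per cylinder
nspf = 8        # sectors per fragment
ncyl = 392      # cylinders in the file system
size = 740880   # file system size, in fragments
ncg = 6         # number of cylinder groups


def howmany(x: int, y: int) -> int:
    """Round-up integer division (local helper)."""
    return (x + y - 1) // y


# Relations the geometry fields should satisfy if they are consistent.
checks = {
    "fpg == bpg * frag": fpg == bpg * frag,
    "fpg == cpg * spc / nspf": fpg == cpg * spc // nspf,
    "size == ncyl * spc / nspf": size == ncyl * spc // nspf,
    "ncg == howmany(size, fpg)": ncg == howmany(size, fpg),
}

# Bitmap size in bytes, assuming one bit per fragment.
map_bytes = howmany(fpg, 8)

# Fragments left for the last cylinder group.
last_cg_frags = size - (ncg - 1) * fpg

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
print("bitmap bytes per group:", map_bytes)
print("fragments in last group:", last_cg_frags)
```

With these inputs all four relations hold, the per-group bitmap comes
to 17955 bytes (matching start + len = 1 + 17954 from the panic
output), and the last cylinder group is shorter than the others at
22680 fragments. None of this identifies the cause; it only shows that
the reported geometry is self-consistent.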
>How-To-Repeat:
23:18:21 whitewater:tmp# mount
/dev/wd0a on / type ffs (noatime, nodevmtime, local)
/dev/wd0e on /usr type ffs (noatime, soft dependencies, local)
/dev/wd0f on /v1 type ffs (noatime, soft dependencies, local)
mfs:187 on /tmp type mfs (synchronous, local)
23:18:25 whitewater:tmp# dumpfs /v1 >v1.dump
23:18:31 whitewater:tmp# head -22 v1.dump
file system: /dev/rwd0f
endian little-endian
magic 11954 time Fri Apr 4 21:00:14 2003
id [ 0 0 ]
cylgrp dynamic inodes 4.4BSD fslevel 3 softdep disabled
nbfree 18649 ndir 28503 nifree 303643 nffree 20571
ncg 6 ncyl 392 size 740880 blocks 725607
bsize 32768 shift 15 mask 0xffff8000
fsize 4096 shift 12 mask 0xfffff000
frag 8 shift 3 fsbtodb 3
cpg 76 bpg 17955 fpg 143640 ipg 80896
minfree 5% optim time maxcontig 2 maxbpg 8192
rotdelay 0ms rps 60
ntrak 240 nsect 63 npsect 63 spc 15120
symlinklen 60 trackskew 0 interleave 1 contigsumsize 2
maxfilesize 0x004002001005ffff
nindir 8192 inopb 256 nspf 8
avgfilesize 16384 avgfpdir 64
sblkno 8 cblkno 16 iblkno 24 dblkno 2552
sbsize 4096 cgsize 32768 offset 8 mask 0xffffff00
csaddr 2552 cssize 4096 shift 11 mask 0xfffff800
cgrotor 0 fmod 0 ronly 0 clean 0x02
23:18:38 whitewater:tmp# cd /v1
23:18:41 whitewater:/v1# ls -tlr
total 48
drwxr-xr-x 3 root wheel 512 Nov 19 2000 export
drwxr-xr-x 2 root wheel 512 Jul 16 2002 tmp
drwxr-xr-x 7 root wheel 512 Nov 30 02:47 netbsd
drwx-----T 2 root wheel 33280 Apr 2 19:23 lost+found
23:18:48 whitewater:/v1# cp /usr/bin/vi .
start = 1, len = 17954, fs = /v1
offset=10736 10736
panic: ffs_alloccg: map corrupted
Stopped in pid 597.1 (cp) at cpu_Debugger+0x4: leave
db> show registers
ds 0x10
es 0x10
fs 0x30
gs 0x10
edi 0xc02aabce fifo_nfsv2nodeop_opv_desc+0x72e
esi 0x100
ebp 0xcf546920 end+0xf212478
ebx 0xcf54694c end+0xf2124a4
edx 0
ecx 0x16e084
eax 0x18e1
eip 0xc021e1c8 cpu_Debugger+0x4
cs 0x8
eflags 0x202
esp 0xcf546920 end+0xf212478
ss 0x10
cpu_Debugger+0x4: leave
db> bt
cpu_Debugger(2,0,4622,c0179f1e,c02aabbf) at cpu_Debugger+0x4
panic(c02aabce,1,4622,c07c60d4,8) at panic+0xb8
ffs_mapsearch(c07c6000,ccf82000,8,0,8) at ffs_mapsearch+0x132
ffs_alloccgblk(cf5378f4,c4838120,8,0,0) at ffs_alloccgblk+0xb8
ffs_alloccg(cf5378f4,0,8,0,8000) at ffs_alloccg+0x12b
ffs_hashalloc(cf5378f4,0,8,0,8000) at ffs_hashalloc+0x2e
ffs_alloc(cf5378f4,0,0,8,0) at ffs_alloc+0x1bf
ffs_balloc_ufs1(cf546c90,cf2cf000,cf546c78,2,cf546ddc) at ffs_balloc_ufs1+0x68b
ffs_balloc(cf546c90,2,cf520b08,cf54a0a0,0) at ffs_balloc+0x2a
VOP_BALLOC(cf54a0a0,0,0,8000,c079b800) at VOP_BALLOC+0x4f
ufs_gop_alloc(cf54a0a0,0,0,8000,0) at ufs_gop_alloc+0xab
ffs_write(cf546e4c,30002,cf507b68,0,4213c) at ffs_write+0x5ec
VOP_WRITE(cf54a0a0,cf546ee0,1,c079b800,cf546f80) at VOP_WRITE+0x3b
vn_write(cf2d3620,cf2d3648,cf546ee0,c079b800,1) at vn_write+0x9f
dofilewrite(cf507b68,7,cf2d3620,48106000,4213c) at dofilewrite+0x87
sys_write(cf29ec80,cf546f80,cf546f78,c02285a4,0) at sys_write+0x6b
syscall_plain(1f,1f,1f,1f,805f150) at syscall_plain+0xab
db> ps
PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
>597 542 597 0 2 0x4002 1 cp
542 529 542 0 2 0x4002 1 bash wait
529 1 529 1000 2 0x4003 1 bash wait
534 378 534 0 2 0x4002 1 bash ttyin
377 1 377 0 2 0x4002 1 getty ttyin
378 1 378 1000 2 0x4003 1 bash wait
381 1 381 0 2 0 1 cron nanosle
348 1 348 0 2 0x20000 1 inetd kqread
311 1 311 0 2 0 1 sshd select
187 1 187 0 2 0 1 mount_mfs mfsidl
152 1 152 0 2 0 1 syslogd
12 0 0 0 2 0x20200 1 aiodoned aiodone
11 0 0 0 2 0x20200 1 ioflush syncer
10 0 0 0 2 0x20200 1 reaper reaper
9 0 0 0 2 0x20200 1 pagedaemon pgdaemo
8 0 0 0 2 0x20200 1 pcic0,0,1 pcicev
7 0 0 0 2 0x20200 1 pcic0,0,0 pcicev
6 0 0 0 2 0x20200 1 pms0 pmsrese
5 0 0 0 2 0x20200 1 usbtask usbtsk
4 0 0 0 2 0x20200 1 usb0 usbevt
3 0 0 0 2 0x20200 1 atapibus0 sccomp
2 0 0 0 2 0x20200 1 acpi sched acpisch
1 0 1 0 2 0x4000 1 init wait
0 -1 0 0 2 0x20200 1 swapper schedul
db> reboot
syncing disks... 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up
panic: wdc_exec_command: polled command not done
Stopped in pid 597.1 (cp) at cpu_Debugger+0x4: leave
db> reboot
rebooting...
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: