Subject: kern/28709: ext2fs panic when generating large coredump
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <joff@embeddedARM.com>
List: netbsd-bugs
Date: 12/19/2004 05:21:00
>Number: 28709
>Category: kern
>Synopsis: ext2fs panic when generating large coredump
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Dec 19 05:21:00 +0000 2004
>Originator: Jesse Off
>Release: 2.99.11
>Organization:
Technologic Systems
>Environment:
not-yet-merged ARM evbarm-like port from 2.99.11
>Description:
An app generating a large coredump on an ext2fs filesystem as root gives the following panic:
panic: kernel diagnostic assertion "ovp->v_type != VREG || ovp->v_size == oip->i_size" failed: file "/home/joff/NetBSD-current/sys/ufs/ext2fs/ext2fs_inode.c", line 413
Stopped in pid 318.6 (nslookup) at netbsd:cpu_Debugger+0x4: bx r14
db> bt
netbsd:panic+0x14
scp=0xc02b75b8 rlv=0xc0368afc (netbsd:__assert+0x34)
rsp=0xc2ebf4ac rfp=0xc2ebf4c0
r7=0x00000000 r6=0xc2e3f820
r5=0x0083b000 r4=0x0000019d
netbsd:__assert+0x10
scp=0xc0368ad8 rlv=0xc02555f8 (netbsd:ext2fs_truncate+0x210)
rsp=0xc2ebf4c4 rfp=0xc2ebf620
r4=0xc2ebf5f8
netbsd:ext2fs_truncate+0x10
scp=0xc02553f8 rlv=0xc0258664 (netbsd:ext2fs_write+0x4e8)
rsp=0xc2ebf624 rfp=0xc2ebf6f8
r10=0xc2e413fc r9=0xc2ebf740
r8=0x00000000 r7=0x00000000 r6=0x001e0000 r5=0x00000000
r4=0x0083b000
netbsd:ext2fs_write+0x10
scp=0xc025818c rlv=0xc02e5214 (netbsd:vn_rdwr+0xf0)
rsp=0xc2ebf6fc rfp=0xc2ebf788
r10=0x00000009 r9=0x00000001
r8=0x20c10000 r7=0x001e0000 r6=0xc2e413fc r5=0x00000001
r4=0xc23c4054
netbsd:vn_rdwr+0x10
scp=0xc02e5134 rlv=0xc0286abc (netbsd:coredump_writesegs_elf32+0x7c)
rsp=0xc2ebf78c rfp=0xc2ebf7d0
r10=0x00000000 r9=0x0083b000
r8=0x001e0000 r7=0xc23cee58 r6=0xc2ebf87c r5=0x00000000
r4=0xc23c4054
netbsd:coredump_writesegs_elf32+0x10
scp=0xc0286a50 rlv=0xc030717c (netbsd:uvm_coredump_walkmap+0x168)
rsp=0xc2ebf7d4 rfp=0xc2ebf820
r10=0xc23cee58 r9=0xc23c4054
r8=0xc2ebf874 r7=0xbfe00000 r6=0xc2c5a2d4 r5=0xc2c5a2a4
r4=0xc2c5a2a0
netbsd:uvm_coredump_walkmap+0x10
scp=0xc0307024 rlv=0xc02868e8 (netbsd:coredump_elf32+0x370)
rsp=0xc2ebf824 rfp=0xc2ebf900
r10=0xc23c4054 r9=0x00000000
r8=0xc2ebf880 r7=0xc23cee58 r6=0xc2c5c8b8 r5=0xc2ebf874
r4=0x00002000
netbsd:coredump_elf32+0x10
scp=0xc0286588 rlv=0xc02a6d00 (netbsd:coredump+0x340)
rsp=0xc2ebf904 rfp=0xc2ebfe38
r10=0xc23c4054 r9=0xc2c5c8b8
r8=0xc2ebf904 r7=0xc23cee58 r6=0xc2ebf95c r5=0xc2e413fc
r4=0xc2ebfd5c
netbsd:coredump+0x10
scp=0xc02a69d0 rlv=0xc02a6908 (netbsd:sigexit+0x80)
rsp=0xc2ebfe3c rfp=0xc2ebfe68
r10=0x00000005 r9=0xc2d2200c
r8=0xc2c5c8b8 r7=0x00000006 r6=0xc2c5c8b8 r5=0x00000006
r4=0xc23cee58
netbsd:sigexit+0x10
scp=0xc02a6898 rlv=0xc02a677c (netbsd:postsig+0x240)
rsp=0xc2ebfe6c rfp=0xc2ebfec8
r7=0x00000006 r6=0x00000000
r5=0xc23cee58 r4=0xc2c5c8b8
netbsd:postsig+0x10
scp=0xc02a654c rlv=0xc0325a84 (netbsd:userret+0x40)
rsp=0xc2ebfecc rfp=0xc2ebfee0
r10=0x00000004 r9=0x00000318
r8=0xc2c5c8b8 r7=0xc039a818 r6=0xc2ebffb4 r5=0xc23cee58
r4=0xc2c5c8b8
netbsd:userret+0x10
scp=0xc0325a54 rlv=0xc0327a20 (netbsd:syscall_plain+0x8c)
rsp=0xc2ebfee4 rfp=0xc2ebff60
r5=0x00000002 r4=0xefa00025
netbsd:syscall_plain+0x10
scp=0xc03279a4 rlv=0xc0327920 (netbsd:swi_handler+0x84)
rsp=0xc2ebff64 rfp=0xc2ebffb0
r10=0x2015947c r9=0x00000318
r8=0x00000001 r7=0xc041500c r6=0xc23cee58 r5=0x00000000
r4=0xc2c5c8b8
netbsd:swi_handler+0x10
scp=0xc03278ac rlv=0xc032ab5c (netbsd:swi_entry+0x28)
rsp=0xc2ebffb4 rfp=0x20dffe68
r8=0x000002ec r7=0x000002f0
r6=0x20c00000 r5=0x2015a968 r4=0x20dffa54
>How-To-Repeat:
Strangely enough, I hit the problem by running "nslookup", which dies in libpthread with an assertion failure and then dumps core:
assertion "next != 0" failed: file "/home/joff/NetBSD-current/lib/libpthread/pthread_run.c", line 130, function "pthread__next"
The resulting core is written without problems on an NFS filesystem, though it is 4GB (and, I assume, sparse). As a side note, nslookup only dumps core on its first invocation after bootup; subsequent invocations work fine. I'm not sure why this is, but I think it's probably unrelated to the ext2fs panic.
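Since the core is large and presumably sparse, it may be possible to exercise the same ext2fs write path without nslookup by writing a few segments at widely spaced offsets. The following is only a sketch under that assumption: the helper name write_sparse_segments, the offsets, and the buffer size are hypothetical, and it has not been confirmed to trigger the panic.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the report): write `len` bytes at each
 * offset in `offs`, leaving holes in between, loosely imitating how a
 * core dump of a sparse address space extends the file.
 * Returns the final file size, or -1 on error.
 */
off_t
write_sparse_segments(const char *path, const off_t *offs, size_t noffs,
    size_t len)
{
	char buf[4096];
	struct stat st;
	size_t i, done, chunk;
	ssize_t n;
	int fd;

	assert(path != NULL && offs != NULL);
	memset(buf, 0xa5, sizeof(buf));

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return -1;

	for (i = 0; i < noffs; i++) {
		/* Write one segment in buffer-sized chunks. */
		for (done = 0; done < len; done += (size_t)n) {
			chunk = len - done;
			if (chunk > sizeof(buf))
				chunk = sizeof(buf);
			n = pwrite(fd, buf, chunk, offs[i] + (off_t)done);
			if (n <= 0) {
				close(fd);
				return -1;
			}
		}
	}

	if (fstat(fd, &st) == -1) {
		close(fd);
		return -1;
	}
	close(fd);
	return st.st_size;
}
```

Calling this with a path on the ext2fs filesystem and offsets that grow toward the 4GB size seen on NFS would show whether plain sparse writes are enough, or whether the coredump path (vn_rdwr from coredump_writesegs_elf32) is needed to hit the assertion.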
>Fix:
Unknown. I'm hitting two issues simultaneously here: the libpthread assertion failure and the ext2fs panic. They may be related to bugs in the ARM port I'm doing, though everything else seems to work fine.