Subject: kern/36395: _fstrans_start panic while executing umount
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Bernd Ernesti <pr200703@veego.de>
List: netbsd-bugs
Date: 05/28/2007 19:55:00
>Number: 36395
>Category: kern
>Synopsis: _fstrans_start panic while executing umount
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 28 19:55:00 +0000 2007
>Originator: Bernd Ernesti
>Release: NetBSD 4.99.20
>Organization:
>Environment:
System: NetBSD 4.99.20
Architecture: i386
Machine: i386
>Description:
I got a panic while executing a sync immediately followed by umount -a:
_fstrans_start with held simple_lock 0xc07fa1e4 CPU 1 /src/sys/kern/vfs_syscalls.c:630
uvm_fault(0xd03b8934, 0, 1) -> 0xe
kernel: supervisor trap page fault, code=0
Stopped in pid 18146.1 (umount) at netbsd:db_read_bytes+0x30: movl 0 (%esi),%eax
db{1}> bt
db_read_bytes(1,4,d6a1e7f4,e18dba1c,a1e7f8) at netbsd:db_read_bytes+0x30
db_get_value(1,4,0,72747366,5f736e61) at netbsd:db_get_value+0x27
db_stack_trace_print(d6a1e8e0,1,ffff,c07590b9,c03e1b20) at netbsd:db_stack_trace_print+0x527
simple_lock_only_held(0,c0681c71,0,c03e36da,0) at netbsd:simple_lock_only_held+0x104
_fstrans_start(c3cda000,1,1,0,0) at netbsd:_fstrans_start+0x24
ffs_fsync(d6a1e9e8,17665be,d6a1ea0c,c044fe00,d6a1e9f0) at netbsd:ffs_fsync+0x32
VOP_FSYNC(e18dba1c,ffffffff,5,0,0) at netbsd:VOP_FSYNC+0x49
vinvalbuf(e18dba1c,1,ffffffff,d0466240,0) at netbsd:vinvalbuf+0x18e
vclean(e18dba1c,1877330,1,c03e25b7,0) at netbsd:vclean+0x92
vgonel(e18dba1c,d0466240,108,c07fa200,5c5) at netbsd:vgonel+0x38
getcleanvnode(0,c07665be,214,d04d66c4,c42bf000) at netbsd:getcleanvnode+0xf6
getnewvnode(15,c42bf000,c35bab00,d6a1eb4c,c07f8974) at netbsd:getnewvnode+0xb0
vfs_allocate_syncvnode(c42bf000,c0766b03,276,d0466240,265) at netbsd:vfs_allocate_syncvnode+0x32
dounmount(c42bf000,0,d0466240,c42bf000,0) at netbsd:dounmount+0x3a5
sys_unmount(d0466240,d6a1ec48,d6a1ec68,804b008,804b000) at netbsd:sys_unmount+0x126
syscall_plain() at netbsd:syscall_plain+0x16a
--- syscall (number 22) ---
db{1}> ps/l
PID LID S FLAGS STRUCT LWP * UAREA * WAIT
>18146 > 1 7 0x20000004 0xd0466240 0xd6a1ece0
8159 1 3 0x84 0xd37820e0 0xd2582ce0 kqread
27727 1 3 0x84 0xd3459600 0xd37f2ce0 pause
10671 1 3 0x84 0xd3c1de00 0xd3cc7ce0 poll
1082 1 3 0x284 0xd0550740 0xd0f2ece0 nfsiod
740 1 3 0x284 0xd05508e0 0xd0daece0 nfsiod
1081 1 3 0x284 0xd0550a80 0xd0c8fce0 nfsiod
692 1 3 0x284 0xd0550dc0 0xd053ece0 nfsiod
798 1 3 0x80 0xd04663e0 0xd052bce0 ttyin
733 1 3 0x80 0xd0466580 0xd0528ce0 ttyin
801 1 3 0x80 0xd0466720 0xd0525ce0 ttyin
573 1 3 0x80 0xd03b53c0 0xd0445ce0 ttyin
623 1 3 0x80 0xcf338040 0xd005cce0 ttyin
794 1 3 0x84 0xcf3381e0 0xd0058ce0 nanoslp
759 1 3 0x84 0xd04668c0 0xd0522ce0 nanoslp
747 1 3 0x84 0xd0466a60 0xd04e6ce0 poll
766 1 3 0x84 0xd0466c00 0xd0448ce0 kqread
741 1 3 0x84 0xd0466da0 0xd04e9ce0 kqread
763 1 3 0x84 0xd03b5220 0xd040cce0 kqread
539 1 3 0x84 0xd03b5a40 0xd00a2ce0 select
488 1 3 0x84 0xd03b58a0 0xd0406ce0 pause
432 1 3 0x84 0xd03b5be0 0xd03fbce0 nfsd
434 1 3 0x84 0xd03b5d80 0xd03f8ce0 nfsd
435 1 3 0x84 0xd00cd060 0xd03f5ce0 nfsd
422 1 3 0x84 0xd00cd200 0xd014fce0 nfsd
376 1 3 0x84 0xd00cd3a0 0xd03f2ce0 poll
416 1 3 0x84 0xd00cd540 0xd00afce0 select
359 1 3 0x84 0xd00cd6e0 0xd00acce0 select
301 1 3 0x84 0xd00cd880 0xd00a9ce0 poll
246 1 2 0x4 0xd00cdd60 0xd005fce0
63 1 3 0x204 0xd00cda20 0xd00a6ce0 physiod
23 1 3 0x204 0xcf338380 0xd0054ce0 aiodoned
22 1 3 0x204 0xcf338520 0xd0051ce0 syncer
21 1 3 0x204 0xcf3386c0 0xd004ece0 pgdaemon
20 1 3 0x204 0xcf338860 0xd004bce0 raidiow
19 1 3 0x204 0xcf338a00 0xd0048ce0 rfwcond
18 1 3 0x204 0xcf338ba0 0xd0045ce0 raidiow
17 1 3 0x204 0xcf338d40 0xd0042ce0 rfwcond
16 1 3 0x204 0xcf331020 0xd003ace0 sccomp
15 1 3 0x204 0xcf3311c0 0xd0033ce0 crypto_wait
14 1 3 0x204 0xcf331360 0xd0030ce0 cardslotev
13 1 3 0x204 0xcf331500 0xd002dce0 atath
12 1 3 0x204 0xcf3316a0 0xd002ace0 atath
11 1 3 0x204 0xcf331840 0xd0027ce0 atath
10 1 3 0x204 0xcf3319e0 0xd0024ce0 atath
9 1 3 0x204 0xcf331b80 0xd0021ce0 usbevt
8 1 3 0x204 0xcf331d20 0xd001ece0 usbevt
7 1 3 0x204 0xcf321000 0xd001bce0 usbtsk
6 1 3 0x204 0xcf3211a0 0xd0018ce0 usbtsk
5 1 3 0x204 0xcf321340 0xd0015ce0 usbevt
4 1 3 0x204 0xcf3214e0 0xd0012ce0 iicintr
3 1 3 0x204 0xcf321680 0xd000fce0 apmev
2 1 3 0x204 0xcf321820 0xd000cce0 smtaskq
1 1 3 0x84 0xcf3219c0 0xd0009ce0 wait
0 3 1 0x80000205 0xcf321b60 0xcf355ce0
2 7 0xa0000205 0xcf321d00 0xcf29bce0
1 3 0x204 0xc086b860 0xc093cce0 schedule
>How-To-Repeat:
This is on an SMP system with an Athlon 64 X2 CPU:
cpu0: AMD Dual-Core Opteron or Athlon 64 X2 (686-class), 2411.10 MHz, id 0x20f32
cpu0: "AMD Athlon(tm) 64 X2 Dual Core Processor 4600+"
cpu0: AMD Power Management features: f<TTP,VID,FID,TS>
cpu0: AMD Cool`n'Quiet Technology 2400 MHz
cpu0: available frequencies (Mhz): 1000 2400
cpu1 at mainbus0 apid 1: (application processor)
cpu1: AMD Dual-Core Opteron or Athlon 64 X2 (686-class), 2411.01 MHz, id 0x20f32
cpu1: "AMD Athlon(tm) 64 X2 Dual Core Processor 4600+"
cpu1: AMD Power Management features: f<TTP,VID,FID,TS>
Here is a small set of possibly relevant kernel options:
options DIAGNOSTIC
options LOCKDEBUG
options DEBUG
options MULTIPROCESSOR
options MPDEBUG
options MPVERBOSE
no pseudo-device fss
no pseudo-device veriexec
Execute the following. It has happened only once since I found the
reason why the 2nd CPU wouldn't work:
sync
umount -a
Panic
I may have a crash dump of this panic. A dump was created after I tried
the sync command a second time, because the first sync didn't work:
db{1}> sync
syncing disks...
simple_lock: locking against myself
lock: 0xc07fa1e4, currently at: /src/sys/kern/vfs_syscalls.c:692
on CPU 1
last locked: /src/sys/kern/vfs_syscalls.c:630
last unlocked: /src/sys/kern/kern_lock.c:628
db_command_table(0,c366e000,c366e048,d6a1ece0,d6a1ec88) at netbsd:__qdivrem+0x26930
Stopped in pid 18146.1 (umount) at netbsd:cpu_Debugger+0x4: popl %ebp
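To make the two diagnostics above easier to follow, here is a minimal userland sketch of the kind of bookkeeping a debug lock can do. It is not the NetBSD LOCKDEBUG implementation: struct dbg_lock, dbg_lock_acquire(), dbg_lock_release(), and dbg_assert_no_lock_held() are hypothetical names invented for this illustration, and the model is single-threaded. It only shows how a lock that records its owner and last lock site can report "held simple_lock" on entry to another function and "locking against myself" on a second acquire by the same CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debug lock; field and function names are illustrative only. */
struct dbg_lock {
	int         held;       /* nonzero while the lock is taken */
	int         owner_cpu;  /* CPU id that took the lock */
	const char *lock_file;  /* source file of the last acquire */
	int         lock_line;  /* source line of the last acquire */
};

enum dbg_result {
	DBG_OK = 0,
	DBG_LOCKING_AGAINST_MYSELF, /* same CPU tried to take the lock twice */
	DBG_WOULD_SPIN,             /* another CPU holds it; a real lock would spin */
	DBG_HELD_ON_ENTRY           /* caller entered a checked function with it held */
};

/* Try to take the lock, reporting a self-deadlock instead of spinning forever. */
enum dbg_result
dbg_lock_acquire(struct dbg_lock *lk, int cpu, const char *file, int line)
{
	assert(lk != NULL);
	if (lk->held && lk->owner_cpu == cpu) {
		fprintf(stderr, "lock: locking against myself\n"
		    "  currently at: %s:%d on CPU %d\n"
		    "  last locked:  %s:%d\n",
		    file, line, cpu, lk->lock_file, lk->lock_line);
		return DBG_LOCKING_AGAINST_MYSELF;
	}
	if (lk->held)
		return DBG_WOULD_SPIN;
	lk->held = 1;
	lk->owner_cpu = cpu;
	lk->lock_file = file;
	lk->lock_line = line;
	return DBG_OK;
}

/* Release the lock; the last lock site is kept for later reports. */
void
dbg_lock_release(struct dbg_lock *lk)
{
	assert(lk != NULL);
	lk->held = 0;
}

/* Check made on entry to a function that must not run with the lock held. */
enum dbg_result
dbg_assert_no_lock_held(const struct dbg_lock *lk, int cpu, const char *func)
{
	assert(lk != NULL);
	if (lk->held && lk->owner_cpu == cpu) {
		fprintf(stderr, "%s with held lock, CPU %d %s:%d\n",
		    func, cpu, lk->lock_file, lk->lock_line);
		return DBG_HELD_ON_ENTRY;
	}
	return DBG_OK;
}
```

In this model, taking the lock on CPU 1 at line 630, then calling dbg_assert_no_lock_held() for "_fstrans_start", yields the first kind of report. A second dbg_lock_acquire() on CPU 1 at line 692 without a release yields the second. Both reports point at the same unreleased acquire, which is the pattern the output above shows for vfs_syscalls.c:630.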
>Fix: