Subject: Is 1.6 NFS buggy?
To: None <current-users@netbsd.org>
From: None <kpneal@pobox.com>
List: current-users
Date: 11/29/2002 23:15:52
I've got a crash on my Alpha running 1.6. The crash is in NFS and I'm
wondering if anyone else has seen it.
Hand-written traceback:
nfs_reclaim+0x80
vclean+0x258
vgonel+0x70
getnewvnode+0x310
ffs_vget+0x8c
vfs_lookup+0x1028
lookup+0x4bc
namei+0x4c8
sys__lstat13+0x58
syscall_plain+0x154
syscall 280
The offending line of code:
/usr/src/sys16/nfs/nfs_node.c:285
93c: 00 00 6a a0 ldl t2,0(s1)
940: 83 16 61 48 srl t2,0x8,t2 ********* Blamo!
944: 18 00 60 e0 blbc t2,9a8 <nfs_reclaim+0xe8>
948: 88 00 49 a4 ldq t1,136(s0)
94c: 16 00 40 e4 beq t1,9a8 <nfs_reclaim+0xe8>
/usr/src/sys16/nfs/nfs_node.c:286
	/*
	 * For nqnfs, take it off the timer queue as required.
	 */
--->	if ((nmp->nm_flag & NFSMNT_NQNFS) && np->n_timer.cqe_next != 0) {
		CIRCLEQ_REMOVE(&nmp->nm_timerhead, np, n_timer);
	}
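
For anyone mapping the disassembly back to the source: the ldl/srl/blbc sequence looks like the compiled form of the nmp->nm_flag & NFSMNT_NQNFS test, with s1 holding nmp, and the ldq/beq pair looks like the np->n_timer.cqe_next != 0 test, with s0 holding np. Below is a minimal sketch of that equivalence. It assumes the flag is the single bit 0x100, which is inferred from the shift count of 8 and not checked against the NFS headers. The function and macro names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: the tested flag is bit 8, inferred from "srl t2,0x8,t2". */
#define NQNFS_BIT_GUESS 0x100u

/* What the compiler emitted: shift right by 8, then branch on the low bit. */
static int flag_test_as_compiled(uint32_t nm_flag)
{
	return (int)((nm_flag >> 8) & 1u);
}

/* What the C source says: mask the flag word with the flag bit. */
static int flag_test_as_written(uint32_t nm_flag)
{
	return (nm_flag & NQNFS_BIT_GUESS) != 0;
}
```

Both functions give the same answer for any flag word, so the first memory access for this source line is the load of nm_flag through the nmp pointer.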
The result is this nastiness:
CPU 0: fatal kernel trap:
CPU 0 trap entry = 0x4 (unaligned access fault)
CPU 0 a0 = 0xdeadbeefdeadbeef
CPU 0 a1 = 0x28
CPU 0 a2 = 0x3
CPU 0 pc = 0xfffffc0000351e20
CPU 0 ra = 0xfffffc0000448f38
CPU 0 pv = 0xfffffc0000351da0
CPU 0 curproc = 0xfffffc0000a37448
CPU 0 pid = 17495, comm = find
panic: trap
tlp0: receive ring overrun
tlp1: receive ring overrun
syncing disks... panic: lockmgr: locking against myself
dumping to dev 8,9 offset 298595
dump 48 47 46 45 44 43 42 41 40 39 38 37 36 35
unexpected machine check:
mces = 0x1
vector = 0x670
param = 0xfffffc0000006048
pc = 0xfffffc0000307be8
ra = 0xfffffc0000307bd4
code = 0x100000084
curproc = 0xfffffc0000a37448
pid = 17495, comm = find
panic: machine check
dumping to dev 8,9 offset 298595
dump device not ready
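
The trap registers can be cross-checked against the disassembly. The sketch below assumes the usual Alpha convention for unaligned access traps (a0 = faulting address, a1 = opcode, a2 = register number) and the standard memory-format instruction layout. The helper names are hypothetical. It decodes the instruction bytes 00 00 6a a0 and tests the alignment of the address in a0.

```c
#include <stdint.h>

/* Assemble a 32-bit instruction word from four little-endian bytes. */
static uint32_t insn_from_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	    ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Memory-format fields: opcode in bits 31..26, Ra in 25..21, Rb in 20..16. */
static unsigned insn_opcode(uint32_t insn) { return insn >> 26; }
static unsigned insn_ra(uint32_t insn)     { return (insn >> 21) & 0x1f; }
static unsigned insn_rb(uint32_t insn)     { return (insn >> 16) & 0x1f; }

/* Nonzero if addr is a multiple of size (size must be a power of two). */
static int is_aligned(uint64_t addr, uint64_t size)
{
	return (addr & (size - 1)) == 0;
}

/* Bytes of the instruction at 0x93c, as shown in the disassembly. */
static const uint8_t ldl_bytes[4] = { 0x00, 0x00, 0x6a, 0xa0 };

/* Value reported in a0 by the trap. */
static const uint64_t fault_addr = 0xdeadbeefdeadbeefULL;
```

Under those assumptions the opcode field comes out as 0x28 and the Ra field as 3, matching a1 and a2, and 0xdeadbeefdeadbeef is not 4-byte aligned. That would point at the ldl through s1 (at 0x93c) as the faulting instruction, with the reported pc (pv + 0x80) already advanced to the srl. The 0xdeadbeef pattern suggests, but does not prove, that nmp was read from memory that had already been freed and poisoned.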
--
Kevin P. Neal http://www.pobox.com/~kpn/
"It sounded pretty good, but it's hard to tell how it will work out
in practice." -- Dennis Ritchie, ~1977, "Summary of a DEC 32-bit machine"