Subject: Re: nfs locking panic
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: current-users
Date: 11/17/1999 09:09:03
> I am trying to netboot an SE/30 from a Sun IPX. Frequently, the SE/30 only
> gets to "Building databases..." and then the server panics:
>
> login: panic: nfsd: locking botch in op 3
> Stopped in nfsd at Debugger+0x4: jmpl [%o7 + 0x8], %g0
> db> t
> nfssvc_nfsd(0x0, 0x2, 0xf0387c78, 0xf0190528, 0xf019c040, 0xf9a0edc0) at nfssvc_nfsd+0x6a4
> sys_nfssvc(0x0, 0xf9a0ef28, 0xf9a0ef20, 0xf00b1474, 0xeffffa50, 0xf9a0efb0) at sys_nfssvc+0x5b8
> syscall(0x9b, 0xf9a0efb0, 0x0, 0x1, 0x0, 0xf9a0efb0) at syscall+0x1fc
> _syscall(0x4, 0x21b10, 0x18, 0x10c60, 0x217b0, 0x10108) at _syscall+0x120
> db>
>
> Userland is from an 8/99 binary snapshot, kernel is "NetBSD 1.4K (SPARKLE)
> #6: Fri Oct 1 23:29:06 CEST 1999".
This is a panic from a diagnostic check I added recently.
As the comment above the panic says:
/*
* NFS server procs should neither release
* locks already held, nor leave things
* locked. Catch this sooner, rather than
* later (when we try to relock something we
* already have locked). Careful inspection
* of the failing routine usually turns up the
* lock leak.. once we know what it is..
*/
and, a little later:
/*
* If you see this panic, audit
* nfsrv3_procs[nd->nd_procnum] for vnode
* locking errors (usually, it's due to
* forgetting to vput() something).
*/
Op #3 is NFSPROC_LOOKUP.
I added this check to -current after we spent a fair amount of time
tearing our hair out looking for problems of this form (one NFS op or
another leaving vnodes locked).
Knowing the values of p->p_locks and lockcount at this point would be
useful (just to verify that something was left locked, as opposed to
something extra being unlocked). Also, in a DEBUG kernel, try calling
printlockedvnodes() at this point.
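
For illustration, here is a rough userland sketch of the kind of check
described above: snapshot the lock count before dispatching an op, then
compare afterwards. It is not the kernel code. struct proc_sketch, the
three op_* routines, the procs[] table and dispatch_checked() are all
made-up names for this example; only the p_locks / lockcount comparison
mirrors the two values mentioned above. A positive delta means something
was left locked, a negative one means something extra was unlocked.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-process lock counter (p->p_locks). */
struct proc_sketch {
	int p_locks;		/* number of locks currently held */
};

typedef void (*srv_proc_t)(struct proc_sketch *);

/* Well-behaved op: takes a lock and releases it before returning. */
static void op_balanced(struct proc_sketch *p)
{
	p->p_locks++;		/* e.g. a lookup returns a locked vnode */
	p->p_locks--;		/* e.g. the matching vput() */
}

/* Buggy op: forgets the release on its way out (a lock leak). */
static void op_leaky(struct proc_sketch *p)
{
	p->p_locks++;
}

/* Buggy op: releases a lock it never took. */
static void op_overrelease(struct proc_sketch *p)
{
	p->p_locks--;
}

#define NPROCS 3
static const srv_proc_t procs[NPROCS] = {
	op_balanced, op_leaky, op_overrelease
};

/*
 * Run one op and verify it left the lock count unchanged.
 * Returns 0 if balanced, >0 if locks were leaked, <0 if extra
 * locks were released. The real check panics instead of returning.
 */
int dispatch_checked(struct proc_sketch *p, int procnum)
{
	int lockcount, delta;

	assert(procnum >= 0 && procnum < NPROCS);
	lockcount = p->p_locks;		/* snapshot before the op */
	procs[procnum](p);
	delta = p->p_locks - lockcount;
	if (delta != 0)
		fprintf(stderr,
		    "locking botch in op %d (before=%d, after=%d)\n",
		    procnum, lockcount, p->p_locks);
	return delta;
}
```

In this sketch, calling dispatch_checked() on the leaky op reports a
delta of +1 and on the over-releasing op a delta of -1, which is the
distinction the two values above are meant to settle.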
- Bill