Subject: kern/2258: Mounting nullfs over NFS repeatably panics vnode_pager_uncache
To: None <gnats-bugs@NetBSD.ORG>
From: None <jonathan@DSG.Stanford.EDU>
List: netbsd-bugs
Date: 03/24/1996 18:28:39
>Number: 2258
>Category: kern
>Synopsis: Mounting nullfs over NFS repeatably panics vnode_pager_uncache
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Mar 24 21:50:01 1996
>Last-Modified:
>Originator: Jonathan Stone
>Organization:
Stanford DSG
>Release: 1.1
>Environment:
System: NetBSD Bowl.DSG.Stanford.EDU 1.1B NetBSD 1.1B (DSG) #0: Sun Mar 17 20:44:01 PST 1996 jonathan@Bowl.DSG.Stanford.EDU:/usr/src/sys/arch/i386/compile/DSG i386
>Description:
vnode_pager_uncache() panics when unmounting null filesystems mounted
over an NFS filesystem, in a kernel where DEBUG is defined.
I see two problems with this:
a. The situation for which the DEBUG code is checking
should, perhaps (I don't know for sure) be handled
by the filesystem underlying the NULL filesystem,
not the NULL filesystem itself. Certainly,
if NFS filesytems need to be special-cased, then
NULL-over-NFS filesystems seem to need to be
special-cased, too.
b. panicking, in that situation, is appropriate for
a kernel configured with DIAGNOSTIC,, but not for a
kernel configured with DEBUG.
"DEBUG isn't supposed to do that".
Due to my own local setup, this panic happens _every_ _time_ I reboot,
which is unpleasant, to say the least.
>How-To-Repeat:
Put NetBSD source on an NFS server.
Mount that NFS server, on a NetBSD client, somewhere other
than /usr/src.
Use nullfs to mount the source server under /usr/src, to get a
"correct" canonical pathname into the kernel ident string and
debugging source pathnames into binaries (e.g., for making
distributions or snapshots.
Build a kernel (or do anything that involves unlinking a file.
Reboot. Note the panic and subsequent fscks. This gets boring
very fast, especially when debugging kernels.
>Fix:
The following patch below has worked for me for months. (I thought
I send-pr'ed it in Feburary, but I can't find it in the database.)
The patch changes the panic() to a printf(), unless DIAGNOSTIC
is defined; and extends the special-case check for filesystems
that are "expected" to have unlocked vnodes, to include the null
filesystem.
*** vnode_pager.c.DIST Sat Feb 10 04:35:05 1996
--- vnode_pager.c.dsg Mon Feb 12 12:07:35 1996
***************
*** 464,470 ****
pager = (vm_pager_t)vp->v_vmdata;
if (pager == NULL)
return (TRUE);
! #ifdef DEBUG
if (!VOP_ISLOCKED(vp)) {
#ifdef NFSCLIENT
extern int (**nfsv2_vnodeop_p) __P((void *));
--- 464,470 ----
pager = (vm_pager_t)vp->v_vmdata;
if (pager == NULL)
return (TRUE);
! #if defined(DIAGNOSTIC) || defined(DEBUG)
if (!VOP_ISLOCKED(vp)) {
#ifdef NFSCLIENT
extern int (**nfsv2_vnodeop_p) __P((void *));
***************
*** 472,489 ****
#ifdef FIFO
extern int (**fifo_nfsv2nodeop_p) __P((void *));
#endif
if (vp->v_op != nfsv2_vnodeop_p
&& vp->v_op != spec_nfsv2nodeop_p
#ifdef FIFO
&& vp->v_op != fifo_nfsv2nodeop_p
! #endif
)
! #endif
panic("vnode_pager_uncache: vnode not locked!");
! }
#endif
/*
* Must use vm_object_lookup() as it actually removes
* the object from the cache list.
--- 472,502 ----
#ifdef FIFO
extern int (**fifo_nfsv2nodeop_p) __P((void *));
#endif
+ #ifdef NULLFS
+ extern int (**null_vnodeop_p)();
+ #endif
if (vp->v_op != nfsv2_vnodeop_p
&& vp->v_op != spec_nfsv2nodeop_p
#ifdef FIFO
&& vp->v_op != fifo_nfsv2nodeop_p
! #endif /*FIFO*/
! #ifdef NULLFS
! && vp->v_op != null_vnodeop_p /*XXX too inclusive?*/
! #endif /*NULLFS*/
)
! #endif /*NFSCLIENT*/
!
! #ifdef DIAGNOSTIC
panic("vnode_pager_uncache: vnode not locked!");
! #else
! printf("vnode_pager_uncache: vnode not locked!");
#endif
+ }
+
+ #endif /* defined(DIAGNOSTIC) || defined(DEBUG) */
+
/*
* Must use vm_object_lookup() as it actually removes
* the object from the cache list.
>Audit-Trail:
>Unformatted: