Subject: kern/766: union mounts can get UFS locking panics or other deadlocks
To: None <gnats-admin@NetBSD.ORG>
From: John Kohl <jtk@kolvir.blrc.ma.us>
List: netbsd-bugs
Date: 01/29/1995 19:05:06
>Number: 766
>Category: kern
>Synopsis: union mounts can get UFS locking panics or other deadlocks
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jan 29 19:05:04 1995
>Originator: John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release: 1.0A
>Environment:
System: NetBSD kolvir 1.0A NetBSD 1.0A (KOLVIR) #21: Sun Jan 29 20:59:56 EST 1995 jtk@kolvir:/u1/NetBSD-current/src/sys/arch/i386/compile/KOLVIR i386
>Description:
The latest union mount code in -current has a locking misordering which
can lead to panics like the one from ufs_lock():
panic("locking against myself");
The basic problem is a misordered flag manipulation in union_lock().
>How-To-Repeat:
Compile a DIAGNOSTIC kernel.
Mount something with unions. Make sure the top level doesn't have a
shadow directory tree yet.
cd to the union mount, then run two copies of 'find . -ls >/dev/null'.
Watch the kernel panic in short order.
>Fix:
Here's a patch which adds some comments of what's going on in a few
cases that confused me, plus a patch for the bug in ufs_lock(), plus a
patch to the ordering of some flag manipulation in
union_removed_upper(): the CACHED flags really should be manipulated
before we drop the uppervp lock.
===================================================================
RCS file: miscfs/union/RCS/union_vnops.c,v
retrieving revision 1.1
diff -ubw -r1.1 miscfs/union/union_vnops.c
--- 1.1 1995/01/29 20:26:30
+++ miscfs/union/union_vnops.c 1995/01/30 01:54:18
@@ -319,6 +319,11 @@
/* case 2. */
if (uerror != 0 /* && (lerror == 0) */ ) {
if (lowervp->v_type == VDIR) { /* case 2b. */
+ /*
+ * We may be racing another process to make the
+ * upper-level shadow directory. Be careful with
+ * locks/etc!
+ */
dun->un_flags &= ~UN_ULOCK;
VOP_UNLOCK(upperdvp);
uerror = union_mkshadow(um, upperdvp, cnp, &uppervp);
@@ -1353,12 +1358,20 @@
if (un->un_uppervp != NULLVP) {
if (((un->un_flags & UN_ULOCK) == 0) &&
(vp->v_usecount != 0)) {
- un->un_flags |= UN_ULOCK;
+ /*
+ * We MUST always use the order of: take upper
+ * vp lock, manipulate union node flags, drop
+ * upper vp lock. This code must not be an
+ * exception.
+ */
VOP_LOCK(un->un_uppervp);
+ un->un_flags |= UN_ULOCK;
}
#ifdef DIAGNOSTIC
- if (un->un_flags & UN_KLOCK)
- panic("union: dangling upper lock");
+ if (un->un_flags & UN_KLOCK) {
+ vprint("dangling klock", vp);
+ panic("union: dangling upper klock (%lx)", vp);
+ }
#endif
}
@@ -1384,6 +1397,18 @@
return (0);
}
+/*
+ * When operations want to vput() a union node yet retain a lock on
+ * the upper VP (say, to do some further operations like link(),
+ * mkdir(), ...), they set UN_KLOCK on the union node, then call
+ * vput() which calls VOP_UNLOCK() and comes here. union_unlock()
+ * unlocks the union node (leaving the upper VP alone), clears the
+ * KLOCK flag, and then returns to vput(). The caller then does whatever
+ * is left to do with the upper VP, and insures that it gets unlocked.
+ *
+ * If UN_KLOCK isn't set, then the upper VP is unlocked here.
+ */
+
int
union_unlock(ap)
struct vop_lock_args *ap;
@@ -1453,6 +1478,10 @@
printf("\ttag VT_UNION, vp=%x, uppervp=%x, lowervp=%x\n",
vp, UPPERVP(vp), LOWERVP(vp));
+ if (UPPERVP(vp))
+ vprint("union_upper", UPPERVP(vp));
+ if (LOWERVP(vp))
+ vprint("union_lower", LOWERVP(vp));
return (0);
}
===================================================================
--- 1.1 1995/01/29 20:32:22
+++ miscfs/union/union_subr.c 1995/01/30 02:44:36
@@ -949,15 +949,14 @@
union_removed_upper(un)
struct union_node *un;
{
+ if (un->un_flags & UN_CACHED) {
+ un->un_flags &= ~UN_CACHED;
+ LIST_REMOVE(un, un_cache);
+ }
if (un->un_flags & UN_ULOCK) {
un->un_flags &= ~UN_ULOCK;
VOP_UNLOCK(un->un_uppervp);
- }
-
- if (un->un_flags & UN_CACHED) {
- un->un_flags &= ~UN_CACHED;
- LIST_REMOVE(un, un_cache);
}
}
>Audit-Trail:
>Unformatted: