Subject: kern/3006: union_lookup possiblly goes into dead lock.
To: None <gnats-bugs@gnats.netbsd.org>
From: =?ISO-2022-JP?B?Ik1JTk9VUkEgTWFrb3RvIC8gGyRCTCcxOhsoQiAbJEI/PxsoQiI=?= <minoura@kw.netlaputa.or.jp>
List: netbsd-bugs
Date: 12/08/1996 17:31:23
>Number: 3006
>Category: kern
>Synopsis: union_lookup possiblly goes into dead lock.
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Dec 8 04:05:00 1996
>Last-Modified:
>Originator: Naofumi HONDA / MINOURA Makoto
>Organization:
NetBSD/pc98 Core Team
>Release: Nov. 30, '96
>Environment:
System: NetBSD daisy 1.2B NetBSD 1.2B (DAISY) #1: Sun Dec 1 00:32:28 JST 1996 root@daisy:/usr/src/sys/arch/i386/compile/DAISY i386
>Description:
union fs locking protocol is very very complicated and
still has some posibility of dead lock. Here are examples
we've found:-
- union_lookup and union_root may try to lock two vnodes
simultaneously.
- While union_lookup is locking a directory, union_fsync
for this directory may called, which causes deadlock.
>How-To-Repeat:
Timing problem.... Execute find(1) on busy union filesystem or
something like that...
>Fix:
For ideal solution, we should rework the union locking protocol.
The following is somewhat dirty quick hack written by Naofumi
HONDA originally for NetBSD/pc98.
*** /sys/miscfs/union/union_vfsops.c.ORIG Tue Oct 15 01:41:27 1996
--- /sys/miscfs/union/union_vfsops.c Wed Dec 4 22:36:48 1996
***************
*** 373,378 ****
--- 373,396 ----
struct union_mount *um = MOUNTTOUNIONMOUNT(mp);
int error;
int loselock;
+ int lockadj = 0;
+
+ /* XXX:
+ * Assume mount -t union /usr/obj /usr/src.
+ * When process A looks up /usr/src,
+ * then the union upper vnode is locked first.
+ * Here, another process B calls union_root.
+ * B locks the lower vnode and tries to lock the upper.
+ * However the upper vnode is already locked by A.
+ * Moreover, A tries to lock the lower vnode,
+ * which is already locked by B!!
+ */
+ if (um->um_lowervp && um->um_op != UNMNT_BELOW &&
+ VOP_ISLOCKED(um->um_lowervp)) {
+ VREF(um->um_lowervp);
+ VOP_UNLOCK(um->um_lowervp);
+ lockadj = 1;
+ }
/*
* Return locked reference to root.
***************
*** 404,409 ****
--- 422,432 ----
} else {
if (loselock)
VTOUNION(*vpp)->un_flags &= ~UN_ULOCK;
+ }
+
+ if (lockadj) {
+ VOP_LOCK(um->um_lowervp);
+ vrele(um->um_lowervp);
}
return (error);
*** union/union_vnops.c.ORIG Tue Oct 15 01:42:03 1996
--- union_vnops.c Fri Dec 6 22:26:08 1996
***************
*** 1024,1042 ****
struct proc *a_p;
} */ *ap = v;
int error = 0;
struct vnode *targetvp = OTHERVP(ap->a_vp);
if (targetvp != NULLVP) {
int dolock = (targetvp == LOWERVP(ap->a_vp));
if (dolock)
VOP_LOCK(targetvp);
! else
! FIXUP(VTOUNION(ap->a_vp));
error = VOP_FSYNC(targetvp, ap->a_cred,
ap->a_waitfor, ap->a_p);
if (dolock)
VOP_UNLOCK(targetvp);
}
return (error);
--- 1024,1064 ----
struct proc *a_p;
} */ *ap = v;
int error = 0;
+ int lockadj = 0;
+ struct union_node *un;
struct vnode *targetvp = OTHERVP(ap->a_vp);
if (targetvp != NULLVP) {
int dolock = (targetvp == LOWERVP(ap->a_vp));
+ un = VTOUNION(ap->a_vp);
if (dolock)
VOP_LOCK(targetvp);
! else if ((un->un_flags & UN_ULOCK) == 0 &&
! VOP_ISLOCKED(targetvp) == 0) {
! /* XXX:
! * To avoid this kind of deadlock, check if
! * someone is locking. Actually, we have to check
! * who is locking...
! * 1) union_lookup XXX. union_lookup calls ufs_lookup,
! * and locks XXX.
! * 2) Here, imagine VOP_FSYNC is called from vinvalbuf.
! * 3) union_fsync tries to lock the XXX vnode.
! * Boo Boo!!!
! */
! lockadj = 1;
! VOP_LOCK(targetvp);
! un->un_flags |= UN_ULOCK;
! }
!
error = VOP_FSYNC(targetvp, ap->a_cred,
ap->a_waitfor, ap->a_p);
if (dolock)
VOP_UNLOCK(targetvp);
+ if (lockadj) {
+ VOP_UNLOCK(targetvp);
+ un->un_flags &= ~UN_ULOCK;
+ }
}
return (error);
>Audit-Trail:
>Unformatted: