netbsd-bugs: kern/1429: mfs can hang at shutdown (+ fix)

Subject: kern/1429: mfs can hang at shutdown (+ fix)
To: None <gnats-bugs@gnats.netbsd.org>
From: John Kohl <jtk@kolvir.arlington.ma.us>
List: netbsd-bugs
Date: 08/31/1995 21:27:46
>Number:         1429
>Category:       kern
>Synopsis:       mfs can hang at shutdown (+ fix)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 31 21:50:02 1995
>Last-Modified:
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        NetBSD-current as of 31 August 1995
>Environment:
	
System: NetBSD kolvir 1.0A NetBSD 1.0A (KOLVIR) #624: Wed Aug 9 07:38:58 EDT 1995 jtk@pattern:/u1/NetBSD-current/src/sys/arch/i386/compile/KOLVIR i386

>Description:

During shutdown, an MFS file system can hang in unmount.  The MFS
process is woken up to service file I/O, but a signal is delivered to
it before it gets onto the processor.  tsleep() then returns an error
indicating the signal delivery, and MFS skips the check of its I/O
queue.  It goes straight on to unmount itself, which hangs in
spec_fsync() waiting for I/O to complete on its "device" vnode.  That
I/O can never complete, because the MFS process is the one that has to
service it.  The stack trace at this point looks like:

db> tr *0xf9dbd03c
bpendsleep(f877bfb4,0,f9dbed9c,f8137006,f877bfb4) at bpendsleep
bpendsleep(f877bfb4,11) at bpendsleep
_spec_fsync(f9dbedb8,f877d200,0,f877b700,f877d000) at _spec_fsync+0xa6
_ffs_sync(f877d200,1,f873d480,f8782b00,f877d200) at _ffs_sync+0x142
_dounmount(f877d200,0,f8782b00) at _dounmount+0x64
_mfs_start(f877d200,0,f8782b00,f877d200,f877d21c) at _mfs_start+0x91
_mount(f8782b00,f9dbef84,f9dbef7c,0,1da94) at _mount+0x476
_syscall() at _syscall+0x239
--- syscall (number 21) ---
0x6e57:

and its device vnode is:
db> call vprint(0,0xf877bf80)
type VBLK, usecount 5, writecount 0, refcount 4, flags (VBWAIT)
        tag VT_MFS, pid 20, base 200704, size 5242880

If I examine the vnode's data field as an mfsnode, it has entries on its
I/O queue:
db> x/x 0xf8783b80
0xf8783b80:         f877bf80
db> 
0xf8783b84:            31000
db> 
0xf8783b88:           500000
db> 
0xf8783b8c:               14
db> 
0xf8783b90:         f8d405cc		<< this guy is the mfs_buflist member!
db> 
0xf8783b94:         deadbeef
db> 
0xf8783b98:         deadbeef
db> 
0xf8783b9c:         deadbeef
db> 
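
As a cross-check of the dump above, here is a minimal sketch that lays
the dumped words over a 32-bit mirror of the mfsnode and compares them
with what vprint reported.  The struct name mfsnode_words and its field
names are hypothetical; only mfs_baseoff and mfs_buflist appear by name
in this report, and the field order is inferred from the dump itself
rather than taken from the header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical 32-bit mirror of the first five words of the mfsnode.
 * The field order is an assumption inferred from the ddb dump.
 */
struct mfsnode_words {
	uint32_t vnode;		/* back pointer to the device vnode */
	uint32_t baseoff;	/* "base" as printed by vprint */
	uint32_t size;		/* "size" as printed by vprint */
	uint32_t pid;		/* "pid" as printed by vprint */
	uint32_t buflist;	/* head of the pending I/O queue */
};

/* Words copied from the dump starting at 0xf8783b80. */
const struct mfsnode_words dumped = {
	0xf877bf80, 0x31000, 0x500000, 0x14, 0xf8d405cc
};

/*
 * Requests are pending when the list head is neither NULL nor the
 * (struct buf *)(-1) sentinel that ends the loop in mfs_start().
 */
int
has_pending_io(const struct mfsnode_words *m)
{
	assert(m != NULL);
	return m->buflist != 0 && m->buflist != UINT32_MAX;
}

/* Print the fields in decimal for comparison with the vprint line. */
void
print_mfsnode(const struct mfsnode_words *m)
{
	printf("pid %lu, base %lu, size %lu, pending I/O: %s\n",
	    (unsigned long)m->pid, (unsigned long)m->baseoff,
	    (unsigned long)m->size, has_pending_io(m) ? "yes" : "no");
}
```

Calling print_mfsnode(&dumped) from a small main() should reproduce the
pid, base and size that vprint showed, and report that the queue is not
empty.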


>How-To-Repeat:
Run lots of stuff, using MFS for /tmp.  Shut down with "shutdown -r
now".  Get unlucky in the order of process activation and signal
delivery, and your shutdown hangs.

>Fix:

Scan the I/O queue before attempting the unmount.  I once had a debug
printf in there that actually printed some I/O messages right before an
unmount attempt, so I'm fairly certain this fix works.

*** mfs_vfsops.c	1995/09/01 00:42:44	1.1
--- mfs_vfsops.c	1995/09/01 01:17:38
***************
*** 272,291 ****
  
  	base = mfsp->mfs_baseoff;
  	while (mfsp->mfs_buflist != (struct buf *)(-1)) {
! 		while (bp = mfsp->mfs_buflist) {
! 			mfsp->mfs_buflist = bp->b_actf;
! 			mfs_doio(bp, base);
! 			wakeup((caddr_t)bp);
  		}
  		/*
  		 * If a non-ignored signal is received, try to unmount.
  		 * If that fails, clear the signal (it has been "processed"),
  		 * otherwise we will loop here, as tsleep will always return
  		 * EINTR/ERESTART.
  		 */
! 		if (error = tsleep((caddr_t)vp, mfs_pri, "mfsidl", 0))
  			if (dounmount(mp, 0, p) != 0)
  				CLRSIG(p, CURSIG(p));
  	}
  	return (error);
  }
--- 272,295 ----
  
  	base = mfsp->mfs_baseoff;
  	while (mfsp->mfs_buflist != (struct buf *)(-1)) {
! #define DOIO() \
! 		while (bp = mfsp->mfs_buflist) { \
! 			mfsp->mfs_buflist = bp->b_actf; \
! 			mfs_doio(bp, base); \
! 			wakeup((caddr_t)bp); \
  		}
+ 		DOIO();
  		/*
  		 * If a non-ignored signal is received, try to unmount.
  		 * If that fails, clear the signal (it has been "processed"),
  		 * otherwise we will loop here, as tsleep will always return
  		 * EINTR/ERESTART.
  		 */
! 		if (error = tsleep((caddr_t)vp, mfs_pri, "mfsidl", 0)) {
! 			DOIO();
  			if (dounmount(mp, 0, p) != 0)
  				CLRSIG(p, CURSIG(p));
+ 		}
  	}
  	return (error);
  }
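
For anyone who wants to see the ordering problem outside the kernel,
here is a rough user-space model of one pass through the loop.  It is
only a sketch: struct req, struct sim, drain(), try_unmount() and
interrupted_pass() are all made-up names standing in for the buf queue,
the DOIO() macro, dounmount() and the tsleep() error path, and the
"hang" is modelled as a flag rather than a real sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued struct buf. */
struct req {
	struct req *next;
	int done;
};

/* Hypothetical stand-in for the state mfs_start() works on. */
struct sim {
	struct req *queue;	/* plays the role of mfs_buflist */
	int unmounted;
	int hung;
};

/* Plays the role of DOIO(): service every queued request. */
static void
drain(struct sim *s)
{
	struct req *r;

	while ((r = s->queue) != NULL) {
		s->queue = r->next;
		r->done = 1;	/* mfs_doio() + wakeup() in the kernel */
	}
}

/*
 * Plays the role of dounmount(): the sync cannot finish while requests
 * are still queued, since only this process would service them.
 */
static int
try_unmount(struct sim *s)
{
	if (s->queue != NULL) {
		s->hung = 1;
		return -1;
	}
	s->unmounted = 1;
	return 0;
}

/*
 * One loop pass in which a request is queued and a signal arrives
 * before the process runs.  Returns 1 if the unmount would hang.
 */
int
interrupted_pass(struct req *pending, int apply_fix)
{
	struct sim s = { NULL, 0, 0 };
	int error;

	assert(pending != NULL);
	drain(&s);		/* queue is empty before sleeping */
	s.queue = pending;	/* I/O is queued while asleep... */
	error = EINTR;		/* ...but the sleep reports the signal */
	if (error) {
		if (apply_fix)
			drain(&s);	/* the extra DOIO() from the patch */
		try_unmount(&s);
	}
	return s.hung;
}
```

With apply_fix set to 0 the queued request is still pending when the
unmount is attempted, which corresponds to the hang; with apply_fix set
to 1 the request is serviced first and the unmount goes through.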

>Audit-Trail:
>Unformatted: