Subject: bin/6065: rebooting is broken in many ways.
To: None <gnats-bugs@gnats.netbsd.org>
From: Lennart Augustsson <augustss@cs.chalmers.se>
List: netbsd-bugs
Date: 08/29/1998 13:39:24
>Number: 6065
>Category: bin
>Synopsis: rebooting is broken in many ways.
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: bin-bug-people (Utility Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Aug 29 04:50:00 1998
>Last-Modified:
>Originator: Lennart Augustsson
>Organization:
>Release: NetBSD-current 980829
>Environment:
System: NetBSD dogbert.cs.chalmers.se 1.3G NetBSD 1.3G (DOGBERT) #0: Fri Aug 28 16:03:01 CEST 1998 sparud@dogbert.cs.chalmers.se:/usr/src/sys/arch/i386/compile/DOGBERT i386
>Description:
Shutting down NetBSD is deficient in several ways. Here is my list:
1) Reboot dies on all kinds of signals that it might get
during rebooting. I committed a fix to this problem
so I hope we can forget about this one.
2) It takes far too long before the system actually shuts down.
The reason is that reboot(1) sends TSTP to init and then
TERM to all other processes and waits for them to die.
If they have not all died within 30 seconds reboot(1) plunges
ahead and calls reboot(2) anyway. Well, on my machines it
always takes 30 seconds because not all processes die.
What is typically left after a few seconds is init, mount_mfs
and inetd. inetd should shut down on TERM, but maybe something
goes wrong? init should shut down on TERM after TSTP, but
it doesn't. mount_mfs should unmount, but maybe it fails?
3) When it actually comes to shutting down, this fails on my machines
(3 i386 and 1 arm32) about one time in four. The machine goes
completely catatonic, no 'syncing disks', no getting into
the debugger, only reset helps (with dirty file systems, of
course).
4) If the disk syncing fails then it seems that all file systems
are considered dirty instead of just the affected ones.
This can be very annoying if the failed sync is for an NFS
file system, and you have large local disks.
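
To make the sequence in item 2 concrete, here is a rough user-space
sketch of "send TERM, wait up to a deadline, then see who is left".
It is not the reboot source. It only signals child processes that it
forked itself (so it is safe to run), and the names term_and_wait and
spawn_idle_child, the 10 ms poll interval and the millisecond timeout
are all assumptions made for the illustration.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that just sleeps in pause().
 * The SIGTERM disposition is set before fork() so the child inherits
 * it without a race; the parent's disposition is restored afterwards.
 */
static pid_t
spawn_idle_child(int ignore_term)
{
	void (*old)(int);
	pid_t pid;

	old = signal(SIGTERM, ignore_term ? SIG_IGN : SIG_DFL);
	pid = fork();
	if (pid == 0) {
		for (;;)
			pause();
	}
	signal(SIGTERM, old);
	return pid;
}

/*
 * Send SIGTERM to every pid in pids[], then poll until all of them
 * have been reaped or timeout_ms has passed.  Reaped entries are set
 * to 0, survivors are reported on stderr, and the number of survivors
 * is returned.  Assumes the pids are children of the caller.
 */
static int
term_and_wait(pid_t *pids, int npids, int timeout_ms)
{
	const struct timespec tick = { 0, 10 * 1000 * 1000 };	/* 10 ms */
	int alive = 0, waited_ms = 0, i;

	assert(npids >= 0 && timeout_ms >= 0);

	for (i = 0; i < npids; i++) {
		/* Never signal pid <= 0: that would hit a whole group. */
		if (pids[i] > 0 && kill(pids[i], SIGTERM) == 0)
			alive++;
		else
			pids[i] = 0;
	}

	while (alive > 0 && waited_ms < timeout_ms) {
		nanosleep(&tick, NULL);
		waited_ms += 10;
		for (i = 0; i < npids; i++) {
			if (pids[i] > 0 &&
			    waitpid(pids[i], NULL, WNOHANG) == pids[i]) {
				pids[i] = 0;	/* exited and reaped */
				alive--;
			}
		}
	}

	for (i = 0; i < npids; i++)
		if (pids[i] > 0)
			fprintf(stderr, "pid %ld still alive after TERM\n",
			    (long)pids[i]);
	return alive;
}
```

A child that ignores TERM makes the wait run to the full deadline,
which is the behaviour described above; printing the survivors is the
kind of tracing that would show which processes hold things up.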
>How-To-Repeat:
Just do reboot. :-)
>Fix:
1) Fixed.
2) I guess you just have to insert lots of tracing and figure
out what keeps those last processes from dying.
3) This is the hard one. I have no idea what goes wrong, but
it happens very often to me.
4) Should be a matter of bookkeeping.
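
As an illustration of the bookkeeping meant in 4), here is a small
user-space sketch that records the sync result per file system, so
that only the ones whose sync failed end up marked dirty. This is not
kernel code; struct fs_entry, sync_all and the stub sync functions are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file-system record. */
struct fs_entry {
	const char *mountpoint;
	int (*sync_fn)(const struct fs_entry *);	/* 0 on success */
	int clean;					/* set by sync_all() */
};

/* Stub sync routines standing in for a real per-file-system sync. */
static int
sync_ok(const struct fs_entry *fs)
{
	(void)fs;
	return 0;
}

static int
sync_fail(const struct fs_entry *fs)
{
	(void)fs;
	return -1;
}

/*
 * Sync every file system and record the outcome individually.
 * Returns the number of file systems whose sync failed.
 */
static int
sync_all(struct fs_entry *fs, size_t n)
{
	size_t i;
	int failures = 0;

	assert(fs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		fs[i].clean = (fs[i].sync_fn(&fs[i]) == 0);
		if (!fs[i].clean)
			failures++;
	}
	return failures;
}
```

With two local entries using sync_ok and one NFS entry using sync_fail,
sync_all returns 1 and only the NFS entry is left with clean == 0.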
>Audit-Trail:
>Unformatted: