Subject: Re: ffs compatibility added, fsck may complain
To: William Allen Simpson <wsimpson@greendragon.com>
From: Darrin B. Jewell <dbj@netbsd.org>
List: current-users
Date: 03/14/2004 13:30:02
One more thing I noticed after re-reading Perry's scenario:
I would copy the fsck_ffs.repair temporarily into place as
/sbin/fsck_ffs or else mark the filesystems as `do not fsck' in
/etc/fstab until the system reboots and it is safe to upgrade the rest
of your userland.
Darrin
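For the /etc/fstab option, here is a minimal sketch. It assumes the usual
six-field fstab layout, where a sixth field (the fsck pass number) of 0
means the entry is skipped by fsck at boot. The function name skip_fsck is
made up for illustration; it filters fstab text from stdin to stdout and
touches only ffs entries, so review the result before installing it.

```sh
# Hypothetical helper (not part of NetBSD): print fstab text from stdin
# with the pass number (6th field) of every ffs entry set to 0.
# Comments, blank lines, and entries with fewer than six fields pass
# through unchanged. Rewritten lines come out single-space separated.
skip_fsck() {
    awk '
        /^[ \t]*#/ || NF < 6 { print; next }
        $3 == "ffs" { $6 = 0 }
        { print }
    '
}

# Example use (writes a candidate file, does not replace /etc/fstab):
#   skip_fsck < /etc/fstab > /root/fstab.nofsck
```

Remember to restore the original pass numbers once the new userland is
installed.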
"Darrin B. Jewell" <dbj@netbsd.org> writes:
> I've been meaning to add an option to fsck to downgrade the
> filesystem, but even that won't completely help your blind upgrade
> problem. The real answer is that booting a broken -current kernel in
> a blind upgrade situation is dangerous, but that happened months
> ago, and we're past that already.
>
> So, my current best guess as to a blind upgrade path would
> be something like:
>
> Modify a -current fsck_ffs so that it does not try to
> remount a filesystem after it has repaired it. The following
> patch should do this:
>
> --- src/sbin/fsck_ffs/main.c.~1.49.~ Sat Jan 17 17:17:07 2004
> +++ src/sbin/fsck_ffs/main.c Sun Mar 14 11:13:24 2004
> @@ -373,7 +373,7 @@
>  		pwarn("\n***** FILE SYSTEM WAS MODIFIED *****\n");
>  	if (rerun)
>  		pwarn("\n***** PLEASE RERUN FSCK *****\n");
> -	if (hotroot()) {
> +	if (0 && hotroot()) {
>  		struct statfs stfs_buf;
>  		/*
>  		 * We modified the root. Do a mount update on
>
> Compile that fsck_ffs on your existing system by cd'ing
> into src/sbin/fsck_ffs and running:
> make USETOOLS=no DESTDIR=/
> cp ./fsck_ffs /root/fsck_ffs.repair
> make USETOOLS=no DESTDIR=/ cleandir
>
> Compile a -current kernel.
> Stop all processes you can without removing access to the machine.
> Unmount all possible filesystems except for / and /usr.
> Copy the new kernel into place.
> For the root and /usr filesystems, downgrade the mount to read-only.
> If the mount downgrades fail, don't try to force them. Put
> the old kernel back in place and look around for processes that
> are still writing to the disk.
>
> Wave a few dead chickens. Type sync. Wait a few seconds. Type it
> again. It should return immediately. If you can, verify that sync
> does not introduce any disk activity. iostat -x can be useful for
> this.
>
> Verify that the filesystems are clean and up to date on disk first
> by running:
> /root/fsck_ffs.repair -n -f -b 16 -c 3
> If this reports any required changes to the filesystems, stop.
> Re-mount the filesystem read-write and put the running
> kernel back in place. Ask here about how to proceed before
> continuing with the upgrade.
>
> Upgrade the filesystems by running:
> /root/fsck_ffs.repair -b 16 -c 4
> on the raw devices of your filesystems.
>
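The verify and upgrade passes above can be laid out as a dry run before
touching anything. This is only a sketch: the function name is made up,
passing the raw device to the verification pass as well is an assumption,
and it prints the two commands quoted above for each device rather than
running them.

```sh
# Hypothetical dry-run helper: print, without executing, the check and
# conversion commands quoted above for each raw device given as an argument.
print_upgrade_plan() {
    repair=/root/fsck_ffs.repair
    # Pass 1: the no-write verification command for every device.
    for dev in "$@"; do
        echo "$repair -n -f -b 16 -c 3 $dev"
    done
    # Pass 2: the conversion command for every device.
    for dev in "$@"; do
        echo "$repair -b 16 -c 4 $dev"
    done
}

# Example use (device names are placeholders):
#   print_upgrade_plan /dev/rwd0a /dev/rwd0e
```

Listing all verification commands first keeps the rule above visible: no
conversion should be run until every verification pass has come back clean.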
> Be gentle while doing this. You don't want the old kernel
> to try to access any newly upgraded superblocks. Even a read
> only access may cause a panic. Hopefully, for filesystems
> which are already mounted read-only, it will not need to
> go back to disk to re-read the superblock.
>
> reboot
>
> I would test this upgrade path on systems that are not
> in a blind upgrade situation first. I have not tested
> this upgrade path.
>
> Good luck.
>
> Darrin
>
> William Allen Simpson <wsimpson@greendragon.com> writes:
>
> > I never saw an answer to Perry's (and my, and I'm sure many others')
> > problem with blind updating co-lo space to more recent -current:
> >
> > "Perry E. Metzger" wrote:
> > >
> > > Also, the situation is REALLY unfortunate. It means that you're going
> > > to end up with machines mysteriously failing on people without much
> > > recourse in the field if you don't happen to remember the cure. Also,
> > > people needing to blind upgrade boxes in colos will get screwed -- I'm
> > > one of those.
> > >
> > > Is there any way to either get the kernel to fix this for you during
> > > boot, or to provide a way to fix it in advance so that fsck doesn't
> > > fail during reboot? This is actually pretty important.
> > >
> > I'm trying to get ready to test -current in preparation for 2.0, but
> > I'm not sure that everything will be hunky-dory after simply installing
> > a new kernel, reboot, tar zxpf base.tgz et alia, reboot.
> >
> > As Perry suggests, is there a way to fix it in advance?
> > --
> > William Allen Simpson
> > Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32