Re: Lost file-system story

To: Donald Allen <donaldcallen%gmail.com@localhost>
Subject: Re: Lost file-system story
From: "Greg A. Woods" <woods%planix.ca@localhost>
Date: Fri, 09 Dec 2011 17:43:01 -0800

At Fri, 9 Dec 2011 15:50:35 -0500, Donald Allen 
<donaldcallen%gmail.com@localhost> wrote:
> 
> "does not guarantee to keep a consistent file system structure on the
> disk" is what I expected from NetBSD. From what I've been told in this
> discussion, NetBSD pretty much guarantees that if you use async and
> the system crashes, you *will* lose the filesystem if there's been any
> writing to it for an arbitrarily long period of time, since apparently
> meta-data for async filesystems doesn't get written as a matter of
> course.

I'm not sure what the difference is.  You seem to be quibbling over
minor differences and perhaps one-off experiences.  Both OpenBSD and
NetBSD also say that you should not use the "async" flag unless you are
prepared to recreate the file system from scratch if your system
crashes.  That means use newfs(8) [and, by implication, something like
restore(8)], not fsck(8), to recover after a crash.  You got lucky with
your test on OpenBSD.

> And then there's the matter of NetBSD fsck apparently not
> really being designed to cope with the mess left on the disk after
> such a crash. Please correct me if I've misinterpreted what's been
> said here (there have been a few different stories told, so I'm trying
> to compute the mean).

That's been true of Unix (and many unix-like) filesystems and their
fsck(8) commands since the beginning of Unix.

fsck(8) is designed to rely on the possible states of on-disk filesystem
metadata because that's how Unix-based filesystems have been guaranteed
to work (barring use of MNT_ASYNC, obviously).

And that's why by default, and by very strong recommendation, filesystem
metadata for Unix-based filesystems (sans WAPBL) should always be
written synchronously to the disk if you ever hope to even try to use
fsck(8).

> I am not telling the OpenBSD story to rub NetBSD peoples' noses in it.
> I'm simply pointing out that that system appears to be an example of
> ffs doing what I thought it did and what I know ext2 and journal-less
> ext4 do -- do a very good job of putting the world into operating
> order (without offering an impossible guarantee to do so) after a
> crash when async is used, after having been told that ffs and its fsck
> were not designed to do this.

You seem to be very confused about what MNT_ASYNC is and is not.  :-)

Unix filesystems, including the Berkeley Fast File System variant, have
never made any guarantees about the recoverability of an async-mounted
filesystem after a crash.

You seem to have inferred some impossible capability based on your
experience with other non-Unix filesystems that have a completely
different internal structure and implementation from the Unix-based
filesystems in NetBSD.

Perhaps the BSD manuals have assumed some knowledge of Unix history, but
even the NetBSD-1.6 mount(8) manual, from 2002, is _extremely_ clear
about the dangers of the "async" flag, with strong emphasis in the
formatted text on the relevant warning:

     async       All I/O to the file system should be done asyn-
                 chronously.  In the event of a crash, _it_is_
                 _impossible_for_the_system_to_verify_the_integrity_of_
                 _data_on_a_file_system_mounted_with_this_option._  You
                 should only use this option if you have an applica-
                 tion-specific data recovery mechanism, or are willing
                 to recreate the file system from scratch.

According to CVS that wording has not changed since October 1, 2002, and
the emphasised text has been there unchanged since September 16, 1998.

> So I'd love it if my experience encourages someone to improve NetBSD
> ffs and fsck to make use of async practical

As others have already said, this has already been done.  It's called
WAPBL.  See wapbl(4) for more information.  Use "mount -o log" to enable
it.
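
As a rough sketch only, assuming a hypothetical FFS partition /dev/wd0e
mounted on /home (both names are placeholders, so substitute your own
device and mount point), an /etc/fstab entry with logging enabled might
look something like this:

     # device     mount point   type   options   dump   pass
     /dev/wd0e    /home         ffs    rw,log    1      2

The one-off equivalent for the same hypothetical partition would be
"mount -o log /dev/wd0e /home".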

(BTW, I personally don't think you would want to use softdep.  It can
suffer almost as badly as async after a crash, though perhaps without
totally invalidating fsck(8)'s ability to at least recover files and
directories which were static since mount.  It does offer vastly
improved performance in many use cases, but as the manual says, it
should still be used with care, i.e. with recognition of the risks of
less-tested, much more complex code, and of vastly changed internal
implementation semantics implying radically different recovery modes.)

-- 
                                                Greg A. Woods
                                                Planix, Inc.

<woods%planix.com@localhost>       +1 250 762-7675        http://www.planix.com/
