Subject: Re: Playing with dkwedge
To: Bill Studenmund <wrstuden@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 08/24/2005 21:49:35
On Wed, Aug 24, 2005 at 12:37:51PM -0700, Bill Studenmund wrote:
>
> Yes and no. That's what softdeps uses snapshots for, or one of the things
> it uses them for. However snapshots are more for being able to make
> self-consistent backups and for simple "undelete" (deleted something by
> mistake? Chances are it's in the snapshot, so just bring it back).
I meant snapshots are made to take a fixed, consistent image of a live
filesystem, so it's possible to run fsck on that image to detect problems.
Of course there are other uses too :)
>
> The problem I see with what you're proposing, using fsck as a disk
> reliability verification tool, is that that's not what it was designed
> for. While I do not doubt that it really really helped you, I do not think
> we should make this a recommended practice.
>
> If you (or I) really care about the data, we should be using a RAID 5 or
> better. And we should have a program that verifies parity. Not just reads
> the whole disk, but verifies each stripe's parity. Run it say once a week
> on the whole array, and things are good.
Yes, that would be the best choice. Unless you're using a hardware RAID,
in which case you can't do this check (and if the hardware controller does,
you have to trust it).
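
To illustrate the kind of per-stripe check described above, here is a minimal
sketch in Python. It assumes a RAID 5 style layout where the parity block of
each stripe is the byte-wise XOR of its data blocks. The function names and
the in-memory representation of a stripe are hypothetical; a real verifier
would read the blocks from the component disks of the array.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """True if the stored parity matches the XOR of the data blocks."""
    return xor_blocks(data_blocks) == parity_block


def find_bad_stripes(stripes):
    """Return the indexes of stripes whose parity does not match.

    `stripes` is an iterable of (data_blocks, parity_block) pairs
    (a hypothetical representation, chosen only for this sketch).
    """
    return [
        index
        for index, (data_blocks, parity_block) in enumerate(stripes)
        if not stripe_is_consistent(data_blocks, parity_block)
    ]
```

Note that a mismatch only says that the stripe is inconsistent; with single
parity it does not tell which block in the stripe is the wrong one.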
>
> The problem with fsck is that you really just got lucky. fsck wouldn't
> notice if the cache messed up reading file data. It also won't really
> notice (AFAICT) if it gets passed incorrect-but-sensible-looking data at
> certain points.
Yes, I was lucky. But I think fsck does enough checks that random
corruption caused by hardware problems will be detected quickly.
And we can't afford to use ECC memory and RAID everywhere. A periodic fsck
helps detect hardware problems (not talking about software bugs :), and
it would be a shame to lose this.
>
> So if we want to do something, let's use the right tool for it.
>
> > BTW, we should probably add the -x and -X options to fsck, similar to
> > dump(8).
>
> What options are those? I do not see them in our dump(8).
From a 3.0_BETA system:
     -x snap-backup
             Use a snapshot with snap-backup as backup for this dump.  See
             fss(4) for more details.  Snapshot support is experimental.  Be
             sure you have a backup before you use it.

     -X      Similar to -x but uses a file system internal snapshot on the
             file system to be dumped.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference