Subject: Re: parity check with root on raid
To: None <netbsd-help@netbsd.org>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-help
Date: 04/21/2005 08:55:31
Ari Sovijärvi writes:
>
> > on a current i386 system I have almost all filesystems (incl. /) on a
> > RAIDframe (level 1). The RAID set is automatically configured.
> > I don't understand why the rc.d raidframeparity check is done so late:
>
> I've been wondering the same thing, as Linux for example initiates the rebuild
> right after the autoconfiguring arrays have been detected and assembled.
>
> > Shouldn't parity be checked (and possibly be rewritten) before filesystems
> > are checked and mounted?
In theory, yes, but if you have a huge array that might take hours to
check, you probably don't want the system unavailable for that long. The
time it takes to do the check is the time your data is "unprotected"
against a component failure, so whether you want to be "live" during
that time is the question...
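To put rough numbers on "hours to check", here is a minimal sketch. It assumes the check amounts to one sequential pass over the array at a sustained rate; the function name and the sizes and rates below are illustrative assumptions, not measurements of any real RAID set.

```python
def parity_check_hours(array_size_gb: float, throughput_mb_s: float) -> float:
    """Rough time for one full pass over the array at a sustained rate."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = array_size_gb * 1024          # GB -> MB (binary units)
    seconds = total_mb / throughput_mb_s     # time for a single pass
    return seconds / 3600


# Illustrative figures only (assumed, not measured).
for size_gb, rate_mb_s in [(80, 40), (2000, 40)]:
    hours = parity_check_hours(size_gb, rate_mb_s)
    print(f"{size_gb} GB at {rate_mb_s} MB/s: about {hours:.1f} h")
```

The estimate scales linearly with array size, so the larger the array, the longer the window in which a component failure is not covered.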
> AFAIK the parity check is transparent, so once you initiate it, you can go on
> checking the disk and making modifications to it. I took a look at your patch,
> and I'd leave the ") &" line as it was.
Note that if you are doing a fsck at the same time as a parity
check, they will be fighting against each other, and the fsck
will take much longer than normal to complete. If we ever get a
filesystem that doesn't require a long fsck, then we'd certainly want
to move the parity check to make it occur as early as possible.

> Running FSCK on a RAIDframe array before parity check seems to actually corrupt
> the array.
Ummmm... "no". At least, it's not supposed to, it's something I've
never seen happen in all the testing I've done, and if it is
happening, RAIDframe should be disabled completely until the problem
is fixed.
> Here's what happened to me:
>
> The machine crashed under heavy load (updating pkgsrc & compiling something).
>
> When it came up again, FSCK found lots of problems that got fixed
> (which raises another question, shouldn't soft dependencies prevent this?).
> After FSCK RAIDframe parity was rewritten. After I logged in, I immediately
> rebooted the machine and surprise, FSCK found a whole new set of trouble.
One fsck doesn't always find all the problems. If you have a really
nasty crash, multiple "fsck -f"'s might be needed before fsck doesn't
find any further errors. Just because you did one fsck doesn't mean
it fixed all the problems! (I wouldn't go blaming RAIDframe if you
just did a single fsck after a nasty crash.)
> I thought about this for a while and came up with a theory, that the contents
> of the disks were different at the time of FSCK. Maybe some data got written
> on one disk before the crash so that the actual filesystem metadata didn't
> match any more. So when FSCK was checking the array, it fixed (or left
> unfixed) something that wasn't the same on both disks. And when RAIDframe
> rebuilt the array, neither of the disks had a perfectly checked and fixed
> file system.
If the parity is not known to be correct, then in a redundant
configuration RAIDframe will *never* use the parity bits to construct
data. In a RAID 1 configuration, this means that the mirror will
never be read -- only the master will be read. Writes will continue
to go to both. Reads from the mirror will only occur once it is
known that the data for that sector is consistent across both
components.
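The rule above can be illustrated with a toy model. This is only a sketch and not RAIDframe code: the class and method names are hypothetical, and it tracks consistency with a single array-wide flag rather than per sector.

```python
class Raid1Sketch:
    """Toy model of the RAID 1 read/write rule (hypothetical, not RAIDframe)."""

    def __init__(self, sectors: int):
        self.master = [0] * sectors
        self.mirror = [0] * sectors
        self.parity_clean = True   # simplification: one flag for the whole set

    def crash_with_partial_write(self, sector: int, value: int) -> None:
        # Simulate a crash in which a write reached only the mirror.
        self.mirror[sector] = value
        self.parity_clean = False

    def read(self, sector: int, prefer_mirror: bool = False) -> int:
        # The mirror is only eligible for reads once parity is known good.
        if prefer_mirror and self.parity_clean:
            return self.mirror[sector]
        return self.master[sector]

    def write(self, sector: int, value: int) -> None:
        # Writes always go to both components, clean parity or not.
        self.master[sector] = value
        self.mirror[sector] = value

    def rewrite_parity(self) -> None:
        # For a mirror, bring the mirror in line with the master.
        self.mirror = list(self.master)
        self.parity_clean = True


array = Raid1Sketch(sectors=4)
array.crash_with_partial_write(2, 99)           # components now disagree
assert array.read(2, prefer_mirror=True) == 0   # dirty: master is still used
array.write(2, 7)                               # an fsck-style repair hits both
array.rewrite_parity()
assert array.read(2, prefer_mirror=True) == 7   # clean: mirror may be read
```

In this model an fsck that runs before the parity rewrite only ever sees the master's data, and its repairs land on both components, so the later rewrite has nothing to undo.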
> I let RAIDframe finish rebuilding and rebooted again. This time FSCK came up
> clean. I found some corrupted files in the lost+found, all from /usr/pkgsrc.
> So, no precious data was lost and the situation was recovered by re-fetching
> pkgsrc, but this left me thinking maybe the parity rebuild should be initiated
> before FSCK.
It really depends on how paranoid you want to be, and/or whether or
not you can afford to wait for the parity check to complete. The
fsck could be done at the same time the check is being done, but that
really just slows both down...
Later...
Greg Oster