Subject: kern/29540: raidframe can show clean parity on raid1 with a failed disk
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org>
From: None <riz@tastylime.net>
List: netbsd-bugs
Date: 02/27/2005 05:23:00
>Number: 29540
>Category: kern
>Synopsis: raidctl can show clean parity on raid1 with failed disk
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Feb 27 05:23:00 +0000 2005
>Originator: Jeff Rizzo <riz@tastylime.net>
>Release: NetBSD 2.0_STABLE
>Organization:
>Environment:
System: NetBSD guava.tastylime.net 2.0_STABLE NetBSD 2.0_STABLE (TYAN251X.MP) #0: Wed Feb 23 11:08:08 PST 2005 riz@lychee.tastylime.net:/home/riz/buildobj/usr/src/sys/arch/i386/compile/TYAN251X.MP i386
Architecture: i386
Machine: i386
>Description:
I've run into this before, but I had forgotten about it.
I (purposefully) relabelled one disk of a raid1 set, and
raidframe properly reported (on the console) that the component had failed.
However, upon bootup, the parity check shows:
/dev/rraid0d: Parity status: clean
... even though raidctl -s raid0 shows:
guava# raidctl -s raid0
Components:
component0: failed
/dev/wd0a: optimal
No spares.
component0 status is: failed. Skipping label.
Component label for /dev/wd0a:
Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
Version: 2, Serial Number: 20050223, Mod Counter: 662
Clean: No, Status: 0
sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 39102208
RAID Level: 1
Autoconfig: Yes
Root partition: Yes
Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
>How-To-Repeat:
	Build a raid1 set and calculate parity. Then relabel one disk,
	removing the raid partition, and newfs it. Reboot normally, and
	see that parity is reported as 'clean' (a rough command sequence
	is sketched below).
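
	A rough command sequence (assuming a two-component set built from
	wd0a plus a second disk, say wd1a, and a config file at
	/etc/raid0.conf; substitute the actual component names and paths):

	guava# raidctl -C /etc/raid0.conf raid0   # configure the set
	guava# raidctl -I 20050223 raid0          # initialize component labels
	guava# raidctl -iv raid0                  # compute (rewrite) parity
	guava# raidctl -p raid0                   # should now report parity clean
	guava# disklabel -i wd1                   # change the RAID partition's fstype to e.g. 4.2BSD
	guava# newfs /dev/rwd1a
	guava# shutdown -r now

	After the reboot, the boot-time check still prints
	"/dev/rraid0d: Parity status: clean", even though
	raidctl -s raid0 shows the relabelled component as failed.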
>Fix:
	Unknown - the fix is probably straightforward, and one could
	possibly argue that the current behavior is "correct" - but it's
	certainly surprising.
>Unformatted: