Edgar Fuß <ef%math.uni-bonn.de@localhost> writes:

> Suppose I have a level 1 RAID with components A and B; B failed. I add
> C as a hot spare (raidctl -a C) and reconstruct on it (raidctl -F B);
> now I have A "optimal", B "spared" and C "used_spare".
> Then I find that B's failure must have been a glitch; do I raidctl -B
> B or raidctl -B C?
> I suppose that after the copyback, I'll have A and B "optimal" and C
> "spare", right?
> What if, during the reconstruction, I get an I/O error on B? I hope
> the reconstruction will simply stop and leave me with A "optimal", B
> "spared" and C "used_spare", right?

I have not actually encountered this exact situation, but from
reconstructing RAID 1 sets many times: when I have replaced a disk, I
end up with wd0 "used_spare" and wd1 "optimal" after doing -R onto wd0,
and it takes a reboot before the set shows optimal/optimal.

When a disk in a RAID 1 set encounters an error and is kicked out, I
almost always try reconstructing onto it (though if it was a real media
failure rather than a bus failure, I tend to want to replace the disk
anyway). In that case the set goes directly back to optimal/optimal.

I am confident that an I/O error during reconstruction will result in
the reconstruction failing.

It should be possible (and would be appreciated) to construct test
cases for these scenarios and contribute them, using synthetic
failures. This will likely require some new machinery in rump/disks to
return write errors when none actually exist, unless that has already
been written.
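For concreteness, the sequence I believe matches the scenario above
looks like this, with hypothetical names (wd1a for B, wd2a for C, the
set being raid0). I have not exercised the glitch/copyback path myself,
so the -B detail is from my reading of raidctl(8) rather than from
experience:

    # B (wd1a) has been kicked out; add C (wd2a) as a hot spare:
    raidctl -a /dev/wd2a raid0      # C shows up as "spare"

    # Fail B and reconstruct its contents onto the spare:
    raidctl -F /dev/wd1a raid0      # C becomes "used_spare"
    raidctl -s raid0                # watch component and rebuild status

    # If B's failure was only a glitch, copy the data back to it.
    # As I read raidctl(8), -B takes just the raid device, so the
    # "-B B or -B C" question shouldn't arise:
    raidctl -B raid0

    # The in-place alternative I describe above, for a glitched disk
    # that was never physically replaced:
    raidctl -R /dev/wd1a raid0      # reconstruct directly back onto B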
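And to make the test-case idea concrete, here is a rough sketch of the
harness side, based on my reading of rump_server(1); the drvspec
details are from memory, and the error-injection step is entirely
hypothetical, since that is exactly the machinery that would have to be
written:

    # Create small backing files for three synthetic disks:
    dd if=/dev/zero of=A.img bs=1m count=32
    dd if=/dev/zero of=B.img bs=1m count=32
    dd if=/dev/zero of=C.img bs=1m count=32

    # Start a rump kernel serving them as block devices:
    rump_server -lrumpvfs -lrumpdev -lrumpdev_disk -lrumpdev_raidframe \
        -d key=/dev/diskA,hostpath=A.img,size=host \
        -d key=/dev/diskB,hostpath=B.img,size=host \
        -d key=/dev/diskC,hostpath=C.img,size=host \
        unix:///tmp/raidtest

    # Configure a RAID 1 set over diskA/diskB inside the rump kernel,
    # e.g. by running raidctl under rumphijack(3).

    # HYPOTHETICAL from here on -- no such knob exists today, as far
    # as I know; this is the new machinery being proposed:
    #   <tell the rump disk layer to return EIO for writes to diskC>
    #   raidctl -a /dev/diskC raid0
    #   raidctl -F /dev/diskB raid0   # reconstruction should fail on
    #                                 # the injected write error
    #   raidctl -s raid0              # assert the resulting states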