NetBSD-Users archive
Re: RAID reconstruction hangs the whole system
On Thu, Dec 05, 2013 at 04:32:55PM -0600, Greg Oster wrote:
> On Thu, 5 Dec 2013 23:30:06 +0200
> Jarmo Jaakkola <netbsd-users%roskakori.fi@localhost> wrote:
> > On Thu, Dec 05, 2013 at 02:04:09PM -0600, Greg Oster wrote:
> > > On Thu, 5 Dec 2013 21:37:53 +0200
> > > Jarmo Jaakkola <netbsd-users%roskakori.fi@localhost> wrote:
> If you check your disklabels and such, I think what you'll see is the
> size of your RAID 1 set is actually about 2x what it should be....
Actually, it is the size I was expecting: the size of a single component
minus 2*64 sectors. I hope I would have noticed if the size had been
larger than I expected ;D
# for i in 1 3 4; do disklabel wd${i} | grep ' a:'; done
a: 131072 2048 RAID
a: 131072 2048 RAID
a: 131072 2048 RAID
# disklabel raid0 | grep 'total sectors'
total sectors: 130944
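As a quick sanity check of that arithmetic, using only the numbers from
the disklabel output above (the variable names are illustrative, and the
2*64 sector overhead is the figure assumed in this message, not something
looked up in the RAIDframe source):

```shell
# Size of one component's 'a' partition, in sectors (from disklabel wdN).
component_sectors=131072
# Assumed overhead: 2 * 64 sectors.
reserved_sectors=$((2 * 64))
# Size the RAID 1 set should report if it spans a single component.
expected=$((component_sectors - reserved_sectors))
echo "$expected"
```

This prints 130944, which matches the 'total sectors' line for raid0.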
> > The RAID set configured and seems to work just fine except for
> > the reconstruction problems.
>
> I'm surprised it did, as it's technically missing the '4th component'
> that it would need to work properly.... Essentially your wd4a is not
> mirrored anywhere, and doesn't have anywhere to rebuild to -- and it
> might be this latter fact that is causing things to hiccup when you try
> to rebuild.....
I wonder if this also caused the other problem I remember having: I was
not able to remove a hot spare once it was added. I.e. "raidctl -r" did
nothing. I'll try to confirm that one way or another when I start fixing
this muck-up.
> Haven't looked at all the relevant code, but at least
> the RAID 1 config bits in the kernel don't seem to check to make sure
> there are an even number of components provided, and I'm betting
> raidctl doesn't do so either :( (So ya, it 'worked', but wasn't really
> providing the RAID 1 like you were expecting :( ) At a minimum I should
> have added more error checking to enforce the even number of components
> requirement...
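Until such a check exists, the even-number requirement could be guarded
in a wrapper script before configuring the set. A minimal sketch only:
check_raid1_components is a hypothetical helper, not part of raidctl or
the kernel, and it just counts its arguments:

```shell
# Hypothetical helper: succeed only when given an even, non-zero
# number of component names. It does not inspect the devices at all.
check_raid1_components() {
    if [ "$#" -eq 0 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "RAID 1 needs an even number of components, got $#" >&2
        return 1
    fi
    return 0
}
```

Called as "check_raid1_components wd1a wd3a wd4a", it fails and reports
the count; with two components it succeeds silently.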
I'll change to using only two components then. I was going to submit
a PR for this as a TODO for you (I've gotten the idea that you work on
RAIDframe), but I found that this issue has already been reported as
#45162:
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=45162
I'd say the severity could be raised from "non-critical". :)
Thank you very much for your help!
--
Jarmo Jaakkola