NetBSD-Users archive
RAID reconstruction hangs the whole system
Recently I have twice had to reconstruct a RAID set. Both times this
led to a system hang that needed a hard reset.
I wrote a whole email about my problem. Then it occurred to me that it
might be just a stupid user error, so let's get that out of the way
first. From raidctl(8) manual page:
"Note as well that RAID 1 sets are currently limited to only 2
components."
Is this still valid (NetBSD 6.1.2)? What would this actually mean?
Should raidctl barf when trying to create a mirrored set with more than
two components? Or would you get problems like I just did? I ask because
the RAID 1 set whose reconstruction causes a system hang was created
with three components.
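
A quick way to check a configuration file against that documented limit
is sketched below. It assumes the usual raidctl(8) config layout with
"START disks" and "START layout" sections, where the fourth field of the
layout line is the RAID level. The function name is made up for
illustration, and the check only reflects the sentence quoted from the
manual page, not whatever the kernel actually enforces.

```shell
#!/bin/sh
# check_raid1_components FILE   (hypothetical helper, not part of raidctl)
# Prints "<raid_level> <component_count>" for a raidctl-style config file.
# Returns 1 when a RAID 1 set lists more than two components, else 0.
check_raid1_components() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }            # skip comments and blank lines
        $1 == "START"            { section = $2; next }
        section == "disks"       { ndisks++ }        # one component per line
        section == "layout"      { level = $4 }      # 4th field is RAID_level
        END {
            print level, ndisks + 0
            exit (level == "1" && ndisks > 2) ? 1 : 0
        }
    ' "$1"
}
```

For the three-component mirror described here this would print "1 3" and
return a non-zero status.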
Below is the original email I was going to send, describing the problem
in a bit more detail, in case this is not just a PEBKAC.
--8<--8<--8<--
The first time was when I tried to add an initially missing component to
a set. I couldn't get it to work, so I worked around it by creating the
full set from scratch, which works fine.
After the set was created I managed to wiggle the connector loose from
one of the disks and booted like that, which left me with a failed
component. When I tried to reconstruct the set, the system hung again.
So I would do:
# raidctl -a comp dev
# raidctl -F absent dev
or
# raidctl -R comp dev
to start reconstructing the set. Then a minute or two afterwards
the system hangs. If I look at the output of
# raidctl -S dev
I see
1% |* | ETA 00.00 /
with the "I'm doing something" indicator spinning right up until
the system hangs. The ETA never updates from 00.00. Also
# iostat -x -w5
shows almost nonexistent disk activity, and CPU usage is low too.
This is a two-core amd64 machine running NetBSD 6.1.2. The problematic
set is a small three-component RAID 1 set for booting. The rest of those
disks is used for cgds that form a RAID 5. Then there are three other
disks which form yet another cgd + RAID 5.
Now the question is, how do I go about getting some more information
about this? I'm not happy to send just this "it doesn't work"
as a bug report.
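
One possible way to collect something more concrete is to append
timestamped command output to a log file and sync after every write, so
that the last few snapshots have a chance of surviving the hard reset.
The sketch below is only that: the helper name, the log path and the
device name raid0 are assumptions for illustration, and whether the log
actually reaches the disk before the hang is not guaranteed.

```shell
#!/bin/sh
# snapshot LOGFILE CMD [ARGS...]   (hypothetical helper)
# Appends a timestamped header plus the output of CMD to LOGFILE,
# then calls sync so the data is pushed towards the disk.
snapshot() {
    log=$1
    shift
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@" 2>&1
    } >> "$log"
    sync
}

# Example use while a reconstruction is running (raid0 and the log
# path are assumed names, adjust to the real set):
#   while :; do
#       snapshot /var/tmp/recon.log raidctl -s raid0
#       snapshot /var/tmp/recon.log dmesg
#       sleep 5
#   done
```

The log should live on a disk that is not part of the set being
reconstructed; otherwise the logging itself may block along with
everything else.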
--8<--8<--8<--
--
Jarmo Jaakkola