Subject: Re: RAID kills the machine
To: Greg Oster <oster@cs.usask.ca>
From: Robert Elz <kre@munnari.OZ.AU>
List: current-users
Date: 08/26/2002 23:57:11
Date: Fri, 23 Aug 2002 08:39:27 -0600
From: Greg Oster <oster@cs.usask.ca>
Message-ID: <20020823143927.6C37255C02@cs.usask.ca>
[Tobias Schuepp <netbsd@schuepp.net>]
| > I can reproduce that. Does it belong to my disks or is it a bug in
| > raidframe?
Neither of those, it is the ahc driver.
I sent in a PR (kern/11180) in October 2000 about this, and Greg Woods
moans about it whenever he gets the chance (this mail probably being
an invitation for more) which might be why no-one can be bothered to
fix it...
There are probably other PRs as well.
Whenever multiple drives are active on the controller at the same time,
there's a possibility the controller will hang (get into an obscure state,
or something). RAID is a good way (the best way probably) to really
get the controller busy on multiple drives simultaneously.
The PR also reported a raidframe problem that you (Greg) fixed a day or
two later, so that part of the PR is irrelevant now.
| It's a problem with your disks, cables, SCSI termination,
| or something related to the SCSI bus.
Well, I guess the last of those counts, in that the driver is
"something related to the SCSI bus"...
| The disks are having serious enough problems that RAIDframe
| thinks both have failed, and (currently) it just stops the kernel.
Yes, it isn't a raidframe problem (though sometime in the future, having
a failed raid act just like a failed drive would be better - just return
an i/o error). In many cases that will cause the system to panic pretty
soon after, but it should be possible to have a raidframe where one of
the components is an underlying raidframe, and in that case, all that
should happen is that the higher one should see a component failure.
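To make the nesting idea concrete, here is a rough sketch of what I mean.
It is a toy model only, not RAIDframe code: struct dev, struct disk,
struct mirror and demo_nested_failure() are all names invented for the
illustration. A mirror that has lost every component returns EIO instead
of stopping, so a mirror stacked on top of it just sees one more failed
component.

```c
#include <errno.h>

/* Hypothetical model, not the RAIDframe API: a device is anything that
 * can service a read and report 0 (success) or EIO (failure). */
struct dev {
    int (*read)(struct dev *self);
    void *priv;
};

/* A plain disk: every read fails once `broken` is set. */
struct disk {
    struct dev dev;
    int broken;
};

static int disk_read(struct dev *self)
{
    struct disk *d = self->priv;
    return d->broken ? EIO : 0;
}

static void disk_init(struct disk *d, int broken)
{
    d->dev.read = disk_read;
    d->dev.priv = d;
    d->broken = broken;
}

/* A two-way mirror over arbitrary devices, including other mirrors. */
struct mirror {
    struct dev dev;
    struct dev *comp[2];
    int failed[2];
};

static int mirror_read(struct dev *self)
{
    struct mirror *m = self->priv;
    for (int i = 0; i < 2; i++) {
        if (m->failed[i])
            continue;
        if (m->comp[i]->read(m->comp[i]) == 0)
            return 0;
        m->failed[i] = 1;   /* mark the component failed, try the next */
    }
    return EIO;             /* everything failed: report it, do not halt */
}

static void mirror_init(struct mirror *m, struct dev *a, struct dev *b)
{
    m->dev.read = mirror_read;
    m->dev.priv = m;
    m->comp[0] = a;
    m->comp[1] = b;
    m->failed[0] = m->failed[1] = 0;
}

/* The lower mirror loses both disks; the upper mirror should carry on
 * using its remaining good disk. Returns the upper read status and
 * reports whether the upper mirror marked component 0 as failed. */
int demo_nested_failure(int *upper_failed0)
{
    struct disk d0, d1, d2;
    struct mirror lower, upper;

    disk_init(&d0, 1);      /* both disks under the lower mirror are dead */
    disk_init(&d1, 1);
    disk_init(&d2, 0);      /* the upper mirror's other component is fine */
    mirror_init(&lower, &d0.dev, &d1.dev);
    mirror_init(&upper, &lower.dev, &d2.dev);

    int rc = upper.dev.read(&upper.dev);
    *upper_failed0 = upper.failed[0];
    return rc;
}
```

In this model the read on the upper mirror succeeds, and the only visible
effect is that its first component (the lower mirror) is marked failed.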
kre