Subject: Re: bootable RAID-1 array problems
To: Ray Phillips <r.phillips@jkmrc.com>
From: Greg Oster <oster@cs.usask.ca>
List: port-alpha
Date: 08/19/2004 22:20:18
Ray Phillips writes:
> Recently I've tried unsuccessfully to set up a RAID1 array following
> the instructions at http://www.netbsd.org/guide/en/chap-rf.html.
> First I used a -current system built from CVS sources updated on 26
> July, then again with one built from an update done on 17 August.
>
> I completed this successfully on an i386 machine (using the 26 July
> sources) which worked, so I wonder if there's an alpha-specific
> problem or (more likely) if I've done something wrong?
Nope.... This:
> The
> console output for the SCSI pair at the point of the crash was:
>
> RECON: initiating reconstruction on col 0 -> spare at col 2
> sd1(isp0:0:2:0): Check Condition on CDB: 0x08 00 10 40 80 00
> SENSE KEY: Hardware Error
> ASC/ASCQ: ASC 0x44 ASCQ 0x9d
>
> raid0: IO Error. Marking /dev/sd1a as failed.
> raid0: Recon read failed!
and:
> and for the IDE pair:
>
> Aug 19 17:11:41 www /netbsd: stray isa irq 14
> Warning: truncating spare disk /dev/wd0a to 4127616 blocks
> Aug 19 17:12:50 www su: ray to root on /dev/ttyp1
> RECON: initiating reconstruction on col 0 -> spare at col 2
> wd1a: error reading fsbn 1031488 of 1031488-1031615 (wd1 bn 1031488;
> cn 1023 tng
> wd1: (uncorrectable data error)
[snip]
>
> raid0: IO Error. Marking /dev/wd1a as failed.
> raid0: Recon read failed!
indicate hardware errors, and right now the reconstruction code
in RAIDframe doesn't deal at all with those sorts of errors.
> I suppose I was asking for trouble in the second case since wd1 has
> ~63 K of bad sectors, but I'm pretty sure they were in the swap
> partition so I thought they wouldn't be relevant. I've no reason to
> think there was a hardware problem with the SCSI setup.
[snip]
I can't tell from the error, but is it possible you fell off the end
of the SCSI disk while doing the reconstruct? Short of a real error,
that's the only other thing I can think of right now. (You should have
seen the error when doing the '-i' initialization too, if that was
the case. Was the parity of the sets "clean" when you started the
reconstruct?)
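
To make the "fell off the end of the disk" check concrete, here is a
minimal sketch in Python. The helper name and the numbers are
hypothetical and are not taken from the labels in this thread; it only
assumes you have the partition's offset and size and the disk's total
sector count, all in sectors, as a disklabel would report them.

```python
# Hypothetical helper: does a partition run past the end of its disk?
# All quantities are in sectors.

def sectors_past_end(offset, size, total_sectors):
    """Return how many sectors of the partition lie beyond the disk's
    last sector, or 0 if the partition fits."""
    return max(0, offset + size - total_sectors)


if __name__ == "__main__":
    # Made-up numbers for illustration only; substitute the values
    # from your own label.
    total_sectors = 4000000
    offset, size = 63, 4000000
    over = sectors_past_end(offset, size, total_sectors)
    if over:
        print(f"partition extends {over} sectors past the end of the disk")
    else:
        print("partition fits on the disk")
```

A non-zero result would mean reads near the end of the partition go
beyond the physical disk, which could plausibly show up as an I/O error
during a reconstruct.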
Later...
Greg Oster