Subject: Re: RAID1 bootblocks for 2.0 (and more ramblings)
To: None <port-sparc64@netbsd.org>
From: Jonathan Perkin <jonathan@perkin.org.uk>
List: port-sparc64
Date: 12/13/2004 11:22:33
* On 2004-12-12 at 18:45 GMT, Jonathan Perkin wrote:
> Hmm, still broken :/
>
> GENERIC32:
> [..]
> ## Disable UDMA 4 which causes data corruption on the Acer Labs
> ## chipset on Sun Blade 100 and Netra X1 machines.
> #wd* at atabus? drive ? flags 0x0000
> wd0 at atabus0 drive 0 flags 0xfac
> wd1 at atabus1 drive 0 flags 0xfac
>
> wd1 at atabus1 drive 0: <ST3120026A>
> wd1: drive supports 16-sector PIO transfers, LBA48 addressing
> wd1: 111 GB, 232581 cyl, 16 head, 63 sec, 512 bytes/sect x 234441648 sectors
> wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
> wd1(aceride0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
> [..]
> wd1a: DMA error writing fsbn 22634336 of 22634336-22634367 (wd1 bn 22634336; cn 22454 tn 11 sn 11), retrying
> wd1: soft error (corrected)
>
> I may just get some cables for good measure anyway.
Well, I tried two other cables I had lying around, with the same
results. Interestingly, however, I tried disabling one of the drives
after reading about power-related issues when driving both disks
simultaneously, as would happen in a RAID1 configuration. Since doing
that, I've not had a single problem, even after a pkgsrc checkout which
was spewing lots of errors previously. I'm hoping, though, that this
isn't a PSU problem (I'm not likely to find a more powerful one) and is
instead some kind of bus contention with aceride(4) which can be
worked around.
I also noticed that the atabus changes bouyer made last year seem to
have removed the setting for disabling UDMA 4; was this intentional? I
was wondering why the comment in GENERIC32 didn't make any sense.
-> cvs diff -r1.70 -r1.71 GENERIC32
[..]
## Disable UDMA 4 which causes data corruption on the Acer Labs
## chipset on Sun Blade 100 and Netra X1 machines.
-wd* at pciide? channel ? drive ? flags 0x0a00 # Disable UDMA 4
+wd* at atabus? drive ? flags 0x0000
Either we need to have the 0x0a00 setting back or, if the problem has
been fixed and the mode is now auto-detected correctly, the comment
should be removed.
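For reference, here is a rough Python sketch of how I read the flags
encoding described in wd(4): three 4-bit fields (PIO, DMA, UDMA, from
the low nibble upwards), where the top bit of each field enables the
override and the low three bits give the mode, with 0xf meaning
"disable" for DMA and UDMA. That layout is my reading of the man page
rather than something checked against the driver source, and
decode_nibble/decode_wd_flags are just names made up for illustration.

```python
# Sketch only: decode a wd(4) "flags" value, ASSUMING the nibble layout
# described in the wd(4) manual page (not verified against the driver).

def decode_nibble(nibble, can_disable):
    """Describe one 4-bit mode field (hypothetical helper)."""
    if nibble & 0x8 == 0:
        return "drive default"      # top bit clear: field is ignored
    if can_disable and nibble == 0xF:
        return "disabled"           # 0xf disables DMA / UDMA
    return "mode %d" % (nibble & 0x7)

def decode_wd_flags(flags):
    """Return (PIO, DMA, UDMA) descriptions for a flags value."""
    pio = decode_nibble(flags & 0xF, can_disable=False)
    dma = decode_nibble((flags >> 4) & 0xF, can_disable=True)
    udma = decode_nibble((flags >> 8) & 0xF, can_disable=True)
    return pio, dma, udma

if __name__ == "__main__":
    for value in (0x0000, 0x0a00, 0x0fac):
        print("0x%04x -> PIO: %s, DMA: %s, UDMA: %s"
              % ((value,) + decode_wd_flags(value)))
```

If that reading is right, 0x0a00 pins the drive to UDMA 2 (so UDMA 4 is
never used), 0x0000 leaves everything to auto-detection, and the 0xfac
I used above turns UDMA off entirely, which matches the "PIO mode 4,
DMA mode 2" line in the dmesg output.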
Thanks,
--
Jonathan Perkin The NetBSD Project
http://www.perkin.org.uk/ http://www.netbsd.org/