Subject: Re: kern/35008: viaide.c v1.35 sometimes fails horribly
To: None <gnats-bugs@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 11/13/2006 00:27:25
--9amGYk9869ThD9tj
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Tue, Nov 07, 2006 at 02:40:00PM +0000, perry@piermont.com wrote:
> I'm running on an amd64 box with a viaide SATA controller. With ACPI
> not on in the kernel, both version 1.34 and version 1.35 of viaide.c
> lead to periodic failures to boot (perhaps one in every five times),
> with the driver spewing errors during boot and failing to read the
> disk.
>
> However, this PR is about the behavior with ACPI turned on.
>
> Version 1.35 leads to failure about one in every five to ten reboots.
> I get lots of messages, most of which scroll off the screen,
> preventing me from writing them down. :(
>
> This is what was left on the screen that I could type in by hand:
>
> [...]
> : <ST506>
> wd0: drive supports 1-sector PIO transfers, chs addressing
> [note: this is a modern drive and does fine most reboots.]
> wd0: 69632 KB, 1024 cyl, 8 head, 17 sec, 512 bytes/sect x 139264 sectors
> [that's totally wrong of course, and it works on most boots.]
> [then we have a bunch of unimportant junk, and then...]
> wd0(viaide1:0:0): using PIO mode 0
> viaide1:0:0: wait timed out
> wd0d: device timeout reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
> wd0: soft error (corrected)
> wd0: mbr partition exceeds disk size
> wd0: mbr partition exceeds disk size
> wd0: mbr partition exceeds disk size
> wd0: mbr partition exceeds disk size
> boot device: <unknown>
> root device:
OK, it looks like the drive rejected the IDENTIFY command. Maybe it
needs a reset (currently the SATA probe resets only the interface, not
the drive itself, whereas the old probe resets the drives). Could you try
the attached patch?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
--9amGYk9869ThD9tj
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: wdc.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/wdc.c,v
retrieving revision 1.240
diff -u -u -r1.240 wdc.c
--- wdc.c 25 Oct 2006 20:14:00 -0000 1.240
+++ wdc.c 12 Nov 2006 23:24:14 -0000
@@ -238,7 +238,13 @@
bus_space_write_4(wdr->sata_iot, wdr->sata_control, 0, scontrol);
tsleep(wdr, PRIBIO, "sataup", mstohz(50));
- sstatus = bus_space_read_4(wdr->sata_iot, wdr->sata_status, 0);
+ /* wait up to 1s for device to come up */
+ for (i = 0; i < 100; i++) {
+ sstatus = bus_space_read_4(wdr->sata_iot, wdr->sata_status, 0);
+ if ((sstatus & SStatus_DET_mask) == SStatus_DET_DEV)
+ break;
+ tsleep(wdr, PRIBIO, "sataup", mstohz(10));
+ }
switch (sstatus & SStatus_DET_mask) {
case SStatus_DET_NODEV:
@@ -286,6 +292,12 @@
aprint_normal("%s: port %d: device present, speed: %s\n",
chp->ch_atac->atac_dev.dv_xname, chp->ch_channel,
sata_speed(sstatus));
+ /*
+ * issue a reset in case only the interface part of the drive
+ * is up
+ */
+ if (wdcreset(chp, RESET_SLEEP) != 0)
+ chp->ch_drive[0].drive_flags = 0;
break;
default:
--9amGYk9869ThD9tj--
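
For readers who want to see the first hunk's idea in isolation: the patch
replaces a single read of SStatus with a bounded poll (up to 100 reads,
10 ms apart, so roughly 1 s) that stops as soon as the DET field reports
an established link. Below is a minimal user-space sketch of that pattern,
not the driver code itself. The names poll_det, fake_phy, fake_read and
fake_delay are hypothetical and exist only for illustration; the DET values
follow the usual SATA SStatus encoding, and the fake register stands in for
bus_space_read_4() and tsleep().

```c
#include <stdint.h>

/* DET field of the SATA SStatus register (illustrative names). */
#define DET_MASK    0x0000000fu
#define DET_NODEV   0x0u   /* no device detected */
#define DET_DEV_NE  0x1u   /* device detected, link not established */
#define DET_DEV     0x3u   /* device detected, link established */

/* Hypothetical callbacks standing in for the register read and the sleep. */
typedef uint32_t (*read_status_fn)(void *ctx);
typedef void (*delay_fn)(void *ctx, unsigned ms);

/*
 * Read the status register up to max_tries times, waiting step_ms between
 * reads, and stop early once DET reports an established link. Returns the
 * last status value read, so the caller can still switch on the DET field.
 */
static uint32_t
poll_det(read_status_fn rd, delay_fn delay, void *ctx,
    int max_tries, unsigned step_ms)
{
	uint32_t st = 0;
	int i;

	for (i = 0; i < max_tries; i++) {
		st = rd(ctx);
		if ((st & DET_MASK) == DET_DEV)
			break;
		delay(ctx, step_ms);
	}
	return st;
}

/* Fake phy: reports "link not established" for the first ready_after reads. */
struct fake_phy {
	int reads;
	int ready_after;
	unsigned slept_ms;
};

static uint32_t
fake_read(void *ctx)
{
	struct fake_phy *p = ctx;

	p->reads++;
	/* 0x113: IPM = 1, SPD = 1, DET = 3 (link up); 0x001: DET = 1. */
	return (p->reads > p->ready_after) ? 0x113u : 0x001u;
}

static void
fake_delay(void *ctx, unsigned ms)
{
	struct fake_phy *p = ctx;

	p->slept_ms += ms;	/* record the wait instead of sleeping */
}
```

Calling poll_det(fake_read, fake_delay, &phy, 100, 10) mirrors the bounds
used in the patch. Returning the last status value rather than a boolean
keeps the later switch on the DET field unchanged, which is how the patch
itself is structured.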