Subject: Re: kern/14007: uncorrectable data error reading fsbn -- problems with IDE hard disk
To: None <sen@eccosys.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 09/19/2001 22:29:01
On Tue, Sep 18, 2001 at 11:10:24PM -0700, sen@eccosys.com wrote:
>
> >Number: 14007
> >Category: kern
> >Synopsis: uncorrectable data error reading fsbn -- problems with IDE hard disk
> >Confidential: no
> >Severity: serious
> >Priority: high
> >Responsible: kern-bug-people
> >State: open
> >Class: sw-bug
> >Submitter-Id: net
> >Arrival-Date: Tue Sep 18 23:11:00 PDT 2001
> >Closed-Date:
> >Last-Modified:
> >Originator: Sen Nagata
> >Release: 1.5.1
> >Organization:
> >Environment:
> NetBSD 1.5.1 (GENERIC_LAPTOP) #33: Mon Jul 2 15:56:09 CEST 2001 he@nsa.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC_LAPTOP i386
> >Description:
> While using 1.5.2, when trying to read some files, I heard my hard
> disk make some unhealthy sounding noises. Switching to a console, I
> see something like:
>
> wd0: transfer error, downgrading to Ultra-DMA mode 1
> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA data transfers)
> wd0e: uncorrectable data error reading fsbn 5489344 of 5489344-5489359 (wd0 bn 7894369; cn 8353 tn 12 sn 28), retrying
> ...
>
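The numbers in that error line can be pulled apart to see where on the disk the failure sits. Below is a minimal sketch in Python, not anything from the kernel: the regular expression and the helper name parse_disk_error are made up for illustration. It assumes the usual diskerr() convention that "fsbn" is relative to the partition and "bn" is relative to the whole disk. The 15-head, 63-sector geometry is also an assumption; it is not stated in the report, but it is consistent with the cn/tn/sn values printed.

```python
import re

# Error line copied from the report above.
LINE = ("wd0e: uncorrectable data error reading fsbn 5489344 of "
        "5489344-5489359 (wd0 bn 7894369; cn 8353 tn 12 sn 28), retrying")

# Hypothetical pattern for this one message shape; not taken from any tool.
PATTERN = re.compile(
    r"(?P<part>\w+): (?P<msg>.+?) reading fsbn (?P<fsbn>\d+)"
    r"(?: of (?P<first>\d+)-(?P<last>\d+))?"
    r" \((?P<disk>\w+) bn (?P<bn>\d+);"
    r" cn (?P<cn>\d+) tn (?P<tn>\d+) sn (?P<sn>\d+)\)"
)


def parse_disk_error(line):
    """Return the fields of a diskerr-style line as a dict, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Convert purely numeric fields to int, leave names and text as strings.
    return {k: int(v) if v is not None and v.isdigit() else v
            for k, v in m.groupdict().items()}


info = parse_disk_error(LINE)

# Assumed convention: absolute block = partition offset + partition block.
partition_offset = info["bn"] - info["fsbn"]

# Number of blocks covered by the failing transfer (range is inclusive).
blocks_in_transfer = info["last"] - info["first"] + 1

# Assumed geometry (not in the report): 15 heads, 63 sectors per track.
HEADS, SECTORS = 15, 63
bn_from_chs = (info["cn"] * HEADS + info["tn"]) * SECTORS + info["sn"]

print(partition_offset)           # 2405025
print(blocks_in_transfer)         # 16
print(bn_from_chs == info["bn"])  # True
```

If the same block number keeps showing up across retries and reboots, that points at one bad spot on the platter rather than a cabling or controller problem.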
> I've also noticed that sometimes when the disk problems occur, there
> are several mode downgrade attempts -- e.g. starting from some mode ->
> Ultra-DMA mode 1 PIO mode 4 -> DMA mode 2 PIO mode 4 -> PIO mode 4.
Yes, I need to fix this. For this kind of error the driver shouldn't downgrade the transfer mode.
>
> This happened to me a day or so after upgrading to 1.5.2, so I did a
> fresh install of 1.5.2 on to a different hard disk and a different
> machine, followed by transferring data (cp -pR). Bulk transfers of
> data would fail occasionally, but transferring files individually
> appeared to work (I interlaced copying with syncing the disk).
>
> Today, the same thing started to occur on the new machine so I booted
> a 1.5.1 kernel hoping the problem would go away, but no help there
> either.
>
> I've experienced this on a custom compiled kernel (uses the default
> pciide and wd settings) as well as the GENERIC_LAPTOP kernel.
>
> BTW, both hard disks are IBM hard disks and the machines I tried
> this on are ThinkPads (600E and X20).
>
> I hope someone else can reproduce this but searching the archives
> and pr forms didn't turn up anything for me.
Well, your disk is obviously dead.
Now, the problem is to find out why it died. Is it getting enough power?
Is it getting too hot?
It's quite possible that Windows doesn't push it that hard.
--
Manuel Bouyer <bouyer@antioche.eu.org>