Re: Device timeout reading fsbn ...

To: netbsd-users%netbsd.org@localhost
Subject: Re: Device timeout reading fsbn ...
From: "Thomas Mueller" <mueller6725%twc.com@localhost>
Date: Wed, 02 Oct 2019 03:03:49 +0000

from Michael van Elst:

> mueller6725%twc.com@localhost ("Thomas Mueller") writes:
        
> >Do you know when (what version) NCQ was introduced to NetBSD?  Was it before or after 7.99.1?

> It's only in HEAD and will be in netbsd-9.
        
        
> >What is atatctl?  "which atatctl" shows nothing.  Is atatctl part of smartmontools?

> Sorry, atactl, it is a native command. E.g.
        
> # atactl wd0 smart status
> SMART supported, SMART enabled
> id value thresh crit collect reliability description                 raw
(snip)

Now I see why I could trust my old 7.99.1 installation, which predates NCQ support, to act as server when I was updating a NetBSD installation by NFS from the other computer.

Thanks for the information!

I looked through src/doc/CHANGES.prev on HEAD but couldn't find where NCQ was introduced.

I ran "atactl wd1 smart status" but couldn't see anything wrong in the output.
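
One way to scan that output rather than read it by eye is sketched below. It assumes every attribute row begins with three numeric columns (id, value, thresh), as the header quoted above suggests; the function name smart_at_threshold is made up for illustration. It prints only the rows whose value has fallen to or below a nonzero threshold, so no output means no attribute has crossed its threshold, not that the drive is healthy.

```sh
# Sketch only: list SMART attributes whose normalized value has dropped to
# or below the threshold. Assumes each attribute row starts with three
# numeric columns (id, value, thresh), matching the header quoted above.
# smart_at_threshold is a hypothetical helper name, not part of atactl.
smart_at_threshold() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ &&
         $3 + 0 > 0 && $2 + 0 <= $3 + 0'
}

# Usage, as root:
#   atactl wd1 smart status | smart_at_threshold
```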

I suppose I should run smartmontools from a different hard drive, or from some other device such as a USB stick?

from Mike Pumford:

> On 01/10/2019 14:36, Thomas Mueller wrote:
        
> > Do you know when (what version) NCQ was introduced to NetBSD?  Was it before or after 7.99.1?

> It went in after NetBSD 8.x was branched so I'd guess it would be somewhere in
> the 8.99.xx versions. It is in the 9.0_BETA branch as well.
        
> > What is atatctl?  "which atatctl" shows nothing.  Is atatctl part of smartmontools?

> > I don't have smartmontools installed but could run it from the System Rescue CD or build in NetBSD (or FreeBSD or Linux?) on the Hitachi hard drive.

> > Firmware or driver bug could explain why the Western Digital Green hard drive might be adversely affected but not all other hard drives.

> > I believe Western Digital discontinued the Green hard drives because of technical or performance problems.

> The drives were deliberately designed to spin themselves down behind the
> back of the operating system and ATA driver, meaning that the next time the
> OS tried to do an I/O the operation would time out and have to be retried
> after the disk had spun back up. This tended to trigger the type of fsbn
> errors you are seeing.
        
> All the extra spin up/spin down cycles played havoc with performance and I
> think also took its toll on the drive electronics. The whole idea was fairly
> flawed as spinning up a drive uses more power than at any other time in drive
> operation so doing it more often costs power unless you are confident that the
> drive can be down long enough to offset that usage.
        
> I thought they did finally produce a version of the firmware where you could
> at least turn that ridiculous behaviour off but I've no idea where you can
> find it. The other way to avoid it is to ensure the OS does a disk operation
> often enough to inhibit the spindown.
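
A minimal sketch of that last idea, periodic reads to keep the drive busy, could look like the following. The function name keep_awake, the 5-second interval and the raw device /dev/rwd1d are all assumptions for illustration. Whether a read that the drive answers from its own cache actually resets the idle timer is not something this thread establishes, so treat it as an experiment rather than a fix.

```sh
# Sketch only: read one 512-byte block from a device at a fixed interval.
# keep_awake DEVICE INTERVAL_SECONDS [COUNT]
# With no COUNT (or COUNT of 0) it loops until interrupted.
# keep_awake is a hypothetical helper name.
keep_awake() {
    dev=$1
    interval=$2
    count=${3:-0}
    n=0
    while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
        # Stop with an error if the device cannot be read.
        dd if="$dev" of=/dev/null bs=512 count=1 2>/dev/null || return 1
        n=$((n + 1))
        sleep "$interval"
    done
}

# Usage, as root (device name and interval are guesses):
#   keep_awake /dev/rwd1d 5 &
```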

I haven't noticed the crash with FreeBSD, but FreeBSD has other problems, which could be caused by either the hard drive or the motherboard.

But if I want to go further with NetBSD, I guess I need to run

sysctl -w hw.wd1.use_ncq=0

and see if this solves the problem.
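
For reference, a sketch of checking the setting and keeping it across reboots, assuming a kernel new enough (NetBSD 9 or -current) to expose the hw.wd1.use_ncq node at all. The helper name ncq_off_setting is invented for illustration; /etc/sysctl.conf is the usual place for settings applied at boot.

```sh
# Sketch only: assumes a NetBSD 9 / -current kernel whose wd(4) driver
# exposes hw.<drive>.use_ncq, as discussed in this thread.
# ncq_off_setting is a hypothetical helper name.

# Print the sysctl assignment that turns NCQ off for one drive.
ncq_off_setting() {
    printf 'hw.%s.use_ncq=0\n' "$1"
}

# Usage, as root on the affected machine:
#   sysctl hw.wd1.use_ncq                      # show the current value
#   sysctl -w "$(ncq_off_setting wd1)"         # disable until the next reboot
#   ncq_off_setting wd1 >> /etc/sysctl.conf    # reapply at every boot
```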

But I still need to be aware of the possibility of this hard drive going fully bad.

Tom
