Subject: Re: pciide lost interrupt - losing access to the file system
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Johan Ihren <johani@pdc.kth.se>
List: current-users
Date: 10/15/1999 09:31:38

Manuel Bouyer <bouyer@antioche.lip6.fr> writes:
> On Wed, Oct 13, 1999 at 10:04:59PM -0400, Rick Byers wrote:
> > Hi,
> >
> > I'm having a problem with NetBSD-current/i386 (a few days old). I wasn't
> > having this problem a few months ago (sorry I can't be more specific).
> > I've got two hard drives: wd0 is my windoze drive, wd1 is my NetBSD drive.
> > When playing mp3s (music files) from my NetBSD drive I get the following
> > kernel messages every minute or so (and access to the disk is suspended
> > for a few seconds):
> >
> > pciide0:0:1: lost interrupt
> > type: ata
> > c_bcount: 8192
> > c_skip: 0
> > pciide0:0:1: Bus-Master DMA error: missing interrupt, status=0x61
> > wd1e: DMA error writing fsbn 1517072 of 1517072-1517087 (wd1 bn 1694480; cn 1681 tn 0 sn 32), retrying
> > wd1: soft error (corrected)
> >
> > When playing mp3s from my Win95 drive, I get the same sort of message, but
> > instead of "wd1: soft error (corrected)", I get "pciide0:0:0: missing
> > untimeout" and the system never recovers - all access to either disk just
> > blocks forever. I can break into the debugger and do a "reboot", but I
> > get "syncing disks ... panic: lockmgr: no context".
>
> This is interesting; it shouldn't happen. I can't look at this right now, but
> I'll try in the next few days. As you can reproduce the problem, maybe you
> can test some patches for me?
> Oh, and please send-pr this so it doesn't get lost.

I'm seeing the same thing, except in my case it happens while trying to
install 1.4 onto an old 486 machine (no PCI). What I see is
basically:

wdc0:0:0: device timeout
type: ata
c_bcount: 0
c_skip: 0

over and over when booting from the install floppy. At first I thought it
must be broken hardware, but since other OSes (not to mention any by name)
install without trouble, it looks like a NetBSD problem.

Please note that since I'm installing stock 1.4 (straight from the
Usenix CD), this does not appear to be only a recent problem.

Regards,
Johan Ihrén, <johani@pdc.kth.se>,
phone: +46 (8) 790 6844, Center for Parallel Computers,
Royal Institute of Technology, SE-100 44 Stockholm, Sweden