Subject: port-sun3/2727: Writing to tape crashes system with 'done < 0; strategy broken' msg
To: None <gnats-bugs@NetBSD.ORG>
From: None <jari@pilvi.fi>
List: netbsd-bugs
Date: 09/01/1996 20:27:12
>Number: 2727
>Category: port-sun3
>Synopsis: Writing to SCSI tape crashes system with 'done < 0; strategy broken' message
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: gnats-admin (GNATS administrator)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Sep 1 13:35:01 1996
>Last-Modified:
>Originator: Jari Kokko
>Organization:
>Release: 1.2_BETA
>Environment:
System: NetBSD pilvi 1.2_BETA NetBSD 1.2_BETA (PILVI) #12: Sun Aug 25 21:52:36 EET DST 1996 root@pilvi:/usr/src/sys/arch/sun3/compile/PILVI sun3
Last sup of the entire source tree was done on approximately Aug 24, 1996.
>Description:
Writing to the tape drive with dump, tar, etc. very frequently
crashes the system. The symptoms are plain: the system drops
to the kernel debugger, printing:
Sep 1 19:27:01 pilvi /netbsd: st0(si0:4:0): soft error (corrected), data = 00 00 00
Sep 1 19:27:01 pilvi /netbsd: panic: done < 0; strategy broken
I did a savecore, and:
pilvi crash # ps -l -M netbsd.1.core
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
0 282 0 0 10 0 448 0 wait Is kd 0:01.01 (sh)
0 289 0 6 10 0 392 0 wait I+ kd 0:01.44 (dump)
0 290 0 1 2 0 400 0 netio S+ kd 0:00.99 (dump)
0 291 0 3921 -5 0 392 0 - R+ kd 0:01.68 (dump)
0 292 0 7 18 0 392 0 pause S+ kd 0:01.65 (dump)
0 293 0 6 18 0 392 0 pause S+ kd 0:01.43 (dump)
My system is a Sun3/60M. It has 24MB of memory, the big mono
monitor and, perhaps most relevantly, two Sun hatbox 'mass
storage units', both with Micropolis 1558 disks, and two other
disks (a 200MB Seagate and a 310MB one whose make I am not
sure of, possibly CDC). These two disks are a recent addition
and the tape problem existed before they were added, so I don't
think they matter. The tape drive is in the first hatbox. I
think it is an Archive; I am sure it is a QIC-24 60MB drive, and
it was shipped by Sun with the 3/60. The tape drive works (worked)
fine under SunOS 4.1.1.
>How-To-Repeat:
For instance:
shutdown to single user (optional)
dump 0ucf /dev/rst0 /dev/sd0a and change tapes
dump 0ucf /dev/rst0 /dev/sd0f and change tapes
etc.
I don't ever get a successful dump of both root and /var done
before I hit the problem.
>Fix:
No idea. I have looked in the sources
(/src/sys/kern/kern_physio.c) and I can see that not defining
DIAGNOSTIC would prevent the panic :-)
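For illustration only, here is a minimal user-space sketch of the kind
of sanity check the panic message suggests. It is not the kernel
source. It assumes that "done" is the requested byte count minus the
residual count reported back by the driver, so that a residual larger
than the request makes "done" negative. The struct xfer type and the
functions xfer_done() and strategy_broken() are hypothetical names
invented for this sketch; only the field names are modelled on struct
buf. Whether the st driver actually reports such a residual after a
corrected soft error is a guess, not something the report establishes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the two buffer fields involved. */
struct xfer {
    long b_bcount;  /* bytes requested for this transfer */
    long b_resid;   /* bytes the driver reports as NOT transferred */
};

/* Bytes completed: requested count minus reported residual (assumed). */
long xfer_done(const struct xfer *x)
{
    assert(x != NULL);
    return x->b_bcount - x->b_resid;
}

/*
 * Returns 1 when the sanity check would trip, 0 otherwise.
 * A real kernel built with DIAGNOSTIC would panic here; this
 * sketch only prints the message and reports the condition.
 */
int strategy_broken(const struct xfer *x)
{
    long done = xfer_done(x);

    if (done < 0) {
        fprintf(stderr,
            "done < 0; strategy broken (bcount=%ld resid=%ld)\n",
            x->b_bcount, x->b_resid);
        return 1;
    }
    return 0;
}
```

With b_bcount = 512 and b_resid = 0 the check passes; with
b_bcount = 512 and b_resid = 1024 it trips. Compiling the check out
would hide the symptom, but the bad residual would still be there.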
>Audit-Trail:
>Unformatted: