Subject: port-i386/10060: Error on ncr driver
To: None <gnats-bugs@gnats.netbsd.org>
From: None <kivinen@ssh.fi>
List: netbsd-bugs
Date: 05/06/2000 21:03:14
>Number: 10060
>Category: port-i386
>Synopsis: When doing some scsi commands the NCR scsi driver hangs
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-i386-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat May 06 21:04:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Tero Kivinen
>Release: NetBSD-current 2000-04-24
>Organization:
SSH Communications Security
>Environment:
System: NetBSD kahva.ssh.fi 1.4X NetBSD 1.4X (KAHVA) #0: Thu Apr 27 09:28:18 EEST 2000 ztk@kahva.ssh.fi:/usr/src/sys/arch/i386/compile/KAHVA i386
ncr0 at pci0 dev 15 function 0: ncr 53c875 fast20 wide scsi
ncr0: interrupting at irq 5
ncr0: minsync=12, maxsync=137, maxoffs=16, 128 dwords burst, large dma fifo
ncr0: single-ended, open drain IRQ driver, using on-chip SRAM
ncr0: restart (scsi reset).
scsibus0 at ncr0: 16 targets, 8 luns per target
...
scsibus0: waiting 2 seconds for devices to settle...
ss0 at scsibus0 target 4 lun 0: <Nikon, LS-2000, 1.31> SCSI2 6/scanner removable
cd1 at scsibus0 target 6 lun 0: <TEAC, CD-R56S, 1.0E> SCSI2 5/cdrom removable
probe(ncr0:6:1): 10.0 MB/s (100 ns, offset 15)
...
>Description:
I am writing a Linux SANE (Scanner Access Now Easy) emulation
driver for NetBSD so that I can run the Linux version of vuescan
(http://www.hamrick.com/) on my machine. The driver is ready,
but when vuescan starts it issues some SCSI commands, and after
a few of them the ncr driver returns an error. From then on the
ncr driver returns an error for every command sent to that device.
This seems to be a bug in the ncr driver, and powering the
scanner off and on again doesn't help.
I wrote a small test program that issues the same SCSI
commands through the native NetBSD SCIOCCOMMAND ioctl, and it has
the same effect on the ncr driver, so the bug cannot be in my
SANE emulation code.
The bug is repeatable with the test program when it is run on the
Nikon LS-2000 scanner, but if I run the same test program on
the TEAC CD-R56S device it seems to work. So this might also be
a problem in the Nikon LS-2000 scanner itself.
Here is the output of the test program when run on the
/dev/ss0 device (Nikon LS-2000 scanner):
-----------------------------------------------------------------
kahva (9:29) ~/sanei-linux-compat#gcc test.c
kahva (9:29) ~/sanei-linux-compat#./a.out
Request:
00000000: 1200 0000 2400 ....$.
Reply:
00000000: 0680 0202 1f00 0000 4e69 6b6f 6e20 2020 ........Nikon
00000010: 4c53 2d32 3030 3020 2020 2020 2020 2020 LS-2000
00000020: 312e 3331 1.31
Request:
00000000: 1201 0000 0400 ......
Reply:
00000000: 0600 0010 ....
Request:
00000000: 1201 0000 1400 ......
Reply:
00000000: 0600 0010 0001 4041 5051 5253 5460 61c1 ......@APQRST`a.
00000010: d1e1 f0f8 ....
Request:
00000000: 1201 0000 0400 ......
Reply:
00000000: 0600 0010 ....
Request:
00000000: 1201 0000 1400 ......
Reply:
00000000: 0600 0010 0001 4041 5051 5253 5460 61c1 ......@APQRST`a.
00000010: d1e1 f0f8 ....
Request:
00000000: 1201 0100 0400 ......
Reply:
00000000: 0601 0007 ....
Request:
00000000: 1201 0100 0b00 ......
Reply:
00000000: 0601 0007 064d 6f75 6e74 00 .....Mount.
Request:
00000000: 1201 4000 0400 ..@...
ncr0:4: ERROR (a0:0) (6-a7-7) (e0/5) @ (mem a51001b4:a51001b4).
ncr0: regdump: da 10 80 05 47 e0 04 0f 01 06 00 a7 80 00 0f 00.
ncr0: restart (fatal error).
ss0(ncr0:4:0): COMMAND FAILED (9 ff) @0xc09ee000.
Scsi command 7 failed, retsts = 1
Reply:
00000000: 0000 0000 ....
Request:
00000000: 1201 4000 1000 ..@...
ncr0: timeout ccb=0xc09ee000 (skip)
^C
kahva (9:29) ~/sanei-linux-compat#
-----------------------------------------------------------------
After printing the last "ncr0: timeout ..." line the program hung,
and pressing Ctrl-C did nothing. After I turned off the scanner
for a few seconds the program continued and exited because of the
Ctrl-C. If I run the program on the /dev/rcd1d device, the output
looks like this:
-----------------------------------------------------------------
kahva (9:41) ~/sanei-linux-compat#./a.out /dev/rcd1d
cd1(ncr0:6:0): 10.0 MB/s (100 ns, offset 15)
Request:
00000000: 1200 0000 2400 ....$.
Reply:
00000000: 0580 0202 1f00 0098 5445 4143 2020 2020 ........TEAC
00000010: 4344 2d52 3536 5320 2020 2020 2020 2020 CD-R56S
00000020: 312e 3045 1.0E
Request:
00000000: 1201 0000 0400 ......
Reply:
00000000: 0500 0002 ....
Request:
00000000: 1201 0000 1400 ......
Reply:
00000000: 0500 0002 0080 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 ....
Request:
00000000: 1201 0000 0400 ......
Reply:
00000000: 0500 0002 ....
Request:
00000000: 1201 0000 1400 ......
Reply:
00000000: 0500 0002 0080 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 ....
Request:
00000000: 1201 0100 0400 ......
Scsi command 5 failed, retsts = 3
Reply:
00000000: 0000 0000 ....
Request:
00000000: 1201 0100 0b00 ......
Scsi command 6 failed, retsts = 3
Reply:
00000000: 0000 0000 0000 0000 0000 00 ...........
Request:
00000000: 1201 4000 0400 ..@...
Scsi command 7 failed, retsts = 3
Reply:
00000000: 0000 0000 ....
Request:
00000000: 1201 4000 1000 ..@...
Scsi command 8 failed, retsts = 3
Reply:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
zsh: 460 exit 9 ./a.out /dev/rcd1d
-----------------------------------------------------------------
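The retsts values in the two logs above differ: 1 on the scanner and 3
on the CD-ROM. As a reading aid, here is a minimal sketch that maps
them to names. The numeric values are assumed to match the SCCMD_*
definitions in <sys/scsiio.h>; the helper name retsts_name is
hypothetical and is not part of the test program.

```c
#include <stdio.h>

/*
 * Map a scsireq_t retsts value to a symbolic name.
 * Assumption: the numbers below mirror the SCCMD_* return codes in
 * NetBSD's <sys/scsiio.h>. They are spelled out here as literals so
 * the sketch builds on systems without that header.
 */
static const char *retsts_name(int retsts)
{
    switch (retsts) {
    case 0:  return "SCCMD_OK";       /* command completed */
    case 1:  return "SCCMD_TIMEOUT";  /* command timed out */
    case 2:  return "SCCMD_BUSY";     /* device reported busy */
    case 3:  return "SCCMD_SENSE";    /* failed, sense data returned */
    case 4:  return "SCCMD_UNKNOWN";  /* unclassified failure */
    default: return "unrecognized";
    }
}

/* Print a retsts value the way the test program could report it. */
static void print_retsts(int cmd_index, int retsts)
{
    printf("Scsi command %d: retsts = %d (%s)\n",
           cmd_index, retsts, retsts_name(retsts));
}
```

Under that assumption, the scanner failure is reported as a timeout,
while the CD-ROM failures come back with sense data, which would be
expected if the drive simply rejects those INQUIRY pages.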
If I then rerun the program on the /dev/ss0 device, it fails
immediately:
-----------------------------------------------------------------
kahva (9:41) ~/sanei-linux-compat#./a.out /dev/ss0
ncr0:4: ERROR (81:0) (6-a7-7) (0/5) @ (script 1bc:a5094800).
ncr0: script cmd = 900b0000
ncr0: regdump: da 10 80 05 47 00 04 0f 01 06 83 a7 80 00 0f 00.
ncr0: restart (fatal error).
ss0(ncr0:4:0): COMMAND FAILED (9 ff) @0xc09ee000.
^C^C
kahva (9:41) ~/sanei-linux-compat#
-----------------------------------------------------------------
I had to turn off the scanner again to recover. I also
tried rerunning the program on the /dev/rcd1d device, and
now it hangs as well:
-----------------------------------------------------------------
kahva (9:49) ~/sanei-linux-compat#./a.out /dev/rcd1d
Request:
00000000: 1200 0000 2400 ....$.
^C^Cncr0: timeout ccb=0xc09f8000 (skip)
^C^C
-----------------------------------------------------------------
Because I cannot turn off the CD-ROM, I can no longer
recover (turning off the scanner doesn't help).
Because the problem is completely repeatable, I can rerun the
tests with more debugging enabled. Just tell me what kind of
debugging information would be useful and how to enable
it.
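One thing that can be checked from the logs alone is whether the
scanner advertises the page that triggers the failure. The sketch
below scans the 20-byte reply to the page 0x00 request from the first
log. It assumes the SCSI-2 vital product data layout (page code in
byte 1, list length in byte 3, page codes from byte 4 on). The names
vpd_page_listed and ls2000_page00 are hypothetical.

```c
#include <stddef.h>

/* Reply to "1201 0000 1400", copied from the /dev/ss0 log above. */
static const unsigned char ls2000_page00[20] = {
    0x06, 0x00, 0x00, 0x10, 0x00, 0x01, 0x40, 0x41, 0x50, 0x51,
    0x52, 0x53, 0x54, 0x60, 0x61, 0xc1, 0xd1, 0xe1, 0xf0, 0xf8
};

/*
 * Return 1 if `page` appears in a supported-pages reply (page 0x00),
 * 0 if it does not, and -1 if the reply is truncated or is not
 * page 0x00.
 */
static int vpd_page_listed(const unsigned char *reply, size_t len,
                           unsigned char page)
{
    size_t i, count;

    if (len < 4 || reply[1] != 0x00)
        return -1;
    count = reply[3];              /* number of page codes that follow */
    if (4 + count > len)
        return -1;
    for (i = 0; i < count; i++)
        if (reply[4 + i] == page)
            return 1;
    return 0;
}
```

With the bytes above, vpd_page_listed(ls2000_page00, sizeof(ls2000_page00), 0x40)
returns 1, so the request that provokes the error asks for a page the
scanner itself lists as supported.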
>How-To-Repeat:
Compile this code and run it on a SCSI device (the device path is the
optional first argument; the default is /dev/ss0):
----------------------------------------------------------------------
#include <sys/param.h>
#include <sys/ioctl.h>
#include <sys/scsiio.h>
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <util.h>
#include <dev/scsipi/scsipi_all.h>
#include <dev/scsipi/scsi_all.h>
#include <dev/scsipi/scsi_disk.h>
#include <dev/scsipi/scsipiconf.h>
/* One 6-byte CDB plus the number of reply bytes to ask for. */
typedef struct {
  unsigned char request[6];
  int reply_len;
} scsi_requests;

/* INQUIRY (0x12) commands: standard data first, then the EVPD
   pages 0x00, 0x01 and 0x40, each read with a short and then a
   longer allocation length. */
scsi_requests requests[] = {
  { { 0x12, 0x00, 0x00, 0x00, 0x24, 0x00 }, 36 },
  { { 0x12, 0x01, 0x00, 0x00, 0x04, 0x00 }, 4 },
  { { 0x12, 0x01, 0x00, 0x00, 0x14, 0x00 }, 20 },
  { { 0x12, 0x01, 0x00, 0x00, 0x04, 0x00 }, 4 },
  { { 0x12, 0x01, 0x00, 0x00, 0x14, 0x00 }, 20 },
  { { 0x12, 0x01, 0x01, 0x00, 0x04, 0x00 }, 4 },
  { { 0x12, 0x01, 0x01, 0x00, 0x0b, 0x00 }, 11 },
  { { 0x12, 0x01, 0x40, 0x00, 0x04, 0x00 }, 4 },
  { { 0x12, 0x01, 0x40, 0x00, 0x10, 0x00 }, 16 }
};
int num_requests = sizeof(requests) / sizeof(requests[0]);

/* Hex and ASCII dump of a buffer, 16 bytes per line. */
void print_buffer(const unsigned char *buffer, size_t len)
{
  size_t i, j;

  for (i = 0; i < len; i += 16)
    {
      printf("%08lx: ", (unsigned long) i);
      for (j = 0; j < 16; j++)
        {
          if (i + j >= len)
            printf("  ");
          else
            printf("%02x", buffer[i + j]);
          if (j % 2 == 1)
            printf(" ");
        }
      printf(" ");
      for (j = 0; j < 16; j++)
        {
          if (i + j >= len)
            printf(" ");
          else if (buffer[i + j] >= ' ' &&
                   buffer[i + j] <= '~')
            printf("%c", buffer[i + j]);
          else
            printf(".");
        }
      printf("\n");
    }
}

int main(int argc, char **argv)
{
  unsigned char inqbuf[64];
  scsireq_t req;
  const char *dev;
  int fd, i;

  /* Default to the scanner unless a device path is given. */
  dev = (argc < 2) ? "/dev/ss0" : argv[1];
  fd = open(dev, O_RDWR);
  if (fd < 0)
    {
      perror("Opening device");
      exit(1);
    }
  for (i = 0; i < num_requests; i++)
    {
      memset(inqbuf, 0, sizeof(inqbuf));
      memset(&req, 0, sizeof(req));
      memcpy(req.cmd, requests[i].request, 6);
      printf("Request:\n");
      print_buffer(req.cmd, 6);
      req.cmdlen = 6;
      req.databuf = (caddr_t) inqbuf;
      req.datalen = requests[i].reply_len;
      req.timeout = 5000;       /* milliseconds */
      req.flags = SCCMD_READ;
      req.senselen = SENSEBUFLEN;
      if (ioctl(fd, SCIOCCOMMAND, &req) == -1)
        {
          perror("Ioctl SCIOCCOMMAND failed");
          exit(1);
        }
      /* A failed command is reported, but the loop keeps going. */
      if (req.retsts != SCCMD_OK)
        {
          printf("Scsi command %d failed, retsts = %d\n", i, req.retsts);
        }
      printf("Reply:\n");
      print_buffer(inqbuf, requests[i].reply_len);
    }
  close(fd);
  return 0;
}
----------------------------------------------------------------------
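For readers who do not have the INQUIRY layout memorized, the request
table above can be decoded with a few lines of C. This is only a
sketch assuming the SCSI-2 6-byte INQUIRY CDB (opcode 0x12, EVPD flag
in bit 0 of byte 1, page code in byte 2, allocation length in byte 4).
The struct and function names are hypothetical and are not used by
the test program.

```c
#include <stdio.h>

/* Decoded fields of a 6-byte INQUIRY CDB (hypothetical helper type). */
struct inquiry_cdb {
    int evpd;            /* 1 = vital product data page requested */
    unsigned page_code;  /* meaningful only when evpd is set */
    unsigned alloc_len;  /* maximum number of reply bytes */
};

/* Decode `cdb` into `out`; return -1 if the opcode is not INQUIRY. */
static int decode_inquiry6(const unsigned char cdb[6],
                           struct inquiry_cdb *out)
{
    if (cdb[0] != 0x12)
        return -1;
    out->evpd = cdb[1] & 0x01;
    out->page_code = cdb[2];
    out->alloc_len = cdb[4];
    return 0;
}

/* Print one decoded request in a single line. */
static void print_inquiry6(const unsigned char cdb[6])
{
    struct inquiry_cdb d;

    if (decode_inquiry6(cdb, &d) != 0)
        printf("not an INQUIRY command\n");
    else if (d.evpd)
        printf("INQUIRY EVPD page 0x%02x, %u bytes\n",
               d.page_code, d.alloc_len);
    else
        printf("INQUIRY standard data, %u bytes\n", d.alloc_len);
}
```

Read this way, the request that first fails on the scanner
(12 01 40 00 04 00) asks for 4 bytes of EVPD page 0x40.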
>Fix:
None known.
>Release-Note:
>Audit-Trail:
>Unformatted: