Port-sparc64 archive


Re: SCSI issues after 10.0 update



Hi,

I stressed the system for days...
I have it serving the same data over NFS as before.
In the meantime I had it compile something quite heavy for 33 hours, with lots of filesystem activity as well as swapping.
So far, no issues.

I would rule out a hard disk issue (do these older parallel SCSI drives have something like SMART?).
So it is more likely a kernel issue... although of course it is pretty hard to guess what and where.

Riccardo

Riccardo Mottola wrote:
Hi,

Today, my freshly updated Netra T1, which serves NFS to all my sparcs (including pkgsrc and package binaries), became unresponsive.

I attached the serial console and found myself in the debugger; the stack trace looks like it is in a driver. I typed continue and saw a stream of this:


[ 152678.9197479] sd1a: error writing fsbn 15551936 of 15551936-15551967 (sd1 bn 15551936; cn 4330 tn 21 sn 113)
[ 152679.0383219] sd1a: error writing fsbn 15552032 of 15552032-15552063 (sd1 bn 15552032; cn 4330 tn 22 sn 76)
[ 152679.1558558] sd1a: error writing fsbn 15552192 of 15552192-15552223 (sd1 bn 15552192; cn 4330 tn 23 sn 103)
[ 152679.2744306] sd1a: error writing fsbn 15551840 of 15551840-15551871 (sd1 bn 15551840; cn 4330 tn 21 sn 17)
[ 152679.3919657] sd1a: error writing fsbn 15551968 of 15551968-15551999 (sd1 bn 15551968; cn 4330 tn 22 sn 12)
[ 152679.5094991] sd1a: error writing fsbn 15552096 of 15552096-15552127 (sd1 bn 15552096; cn 4330 tn 23 sn 7)
[ 152679.6259930] sd1a: error writing fsbn 15552224 of 15552224-15552255 (sd1 bn 15552224; cn 4330 tn 24 sn 2)
[ 152679.7445680] sd1a: error writing fsbn 15551872 of 15551872-15551903 (sd1 bn 15551872; cn 4330 tn 21 sn 49)
[ 152679.8600212] sd1a: error writing fsbn 15552000 of 15552000-15552031 (sd1 bn 15552000; cn 4330 tn 22 sn 44)
[ 152679.9775560] sd1a: error writing fsbn 15552128 of 15552128-15552159 (sd1 bn 15552128; cn 4330 tn 23 sn 39)
[ 152680.0950897] sd1a: error writing fsbn 15552256 of 15552256-15552287 (sd1 bn 15552256; cn 4330 tn 24 sn 34)
[ 152680.2147047] sd1a: error writing fsbn 15551904 of 15551904-15551935 (sd1 bn 15551904; cn 4330 tn 21 sn 81)
[ 152680.3301591] sd1a: error writing fsbn 15552064 of 15552064-15552095 (sd1 bn 15552064; cn 4330 tn 22 sn 108)
[ 152680.4487334] sd1a: error writing fsbn 15552160 of 15552160-15552191 (sd1 bn 15552160; cn 4330 tn 23 sn 71)
[ 152680.5662672] sd1a: error writing fsbn 15552320 of 15552320-15552351 (sd1 bn 15552320; cn 4330 tn 24 sn 98)
[ 152680.6858827] sd1: sync (50.00ns offset 31), 16-bit (40.000MB/s) transfers, tagged queueing
[ 152680.9942874] esiop0: selection timeout without command, target -1 (sdid 0x4), slot 10
[ 152681.0884124] esiop0: unhandled scsi interrupt, sist=0x400 sstat1=0xf DSA=0xc034f130 DSP=0x3d0
[ 152681.1924234] esiop0: scsi bus reset
[ 152681.2361087] sd1(esiop0:0:1:0): command with tag id 0 reset
[ 152681.3047562] sd1(esiop0:0:1:0): command with tag id 1 reset
[ 152681.3734043] sd1(esiop0:0:1:0): command with tag id 2 reset
[ 152681.4420526] sd1(esiop0:0:1:0): command with tag id 3 reset
[ 152681.5107008] sd1(esiop0:0:1:0): command with tag id 4 reset
[ 152681.5793498] sd1(esiop0:0:1:0): command with tag id 5 reset
[ 152681.6479976] sd1(esiop0:0:1:0): command with tag id 6 reset
[ 152681.7166465] sd1: async, 8-bit transfers, tagged queueing
[ 152681.7842901] sd1: sync (50.00ns offset 31), 16-bit (40.000MB/s) transfers, tagged queueing
[ 152681.8869196] esiop0: DMA IRQ: Illegal instruction DMA fifo empty, DSP=0x4f0 DSA=0xc034e790:  current T/L/Qd
[ 152682.0127705] esiop0: scsi bus reset
[ 152682.0564558] sd1(esiop0:0:1:0): command
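
As a rough aid for reading a console dump like the one above, here is a minimal Python sketch that pulls the "error writing fsbn" entries out of a log and reports how many there are, which block range they cover, and over how many seconds they occurred. The names summarize_write_errors, ERR_RE and SAMPLE are made up for illustration, and the regular expression only assumes the message layout visible in the paste; it is not taken from the kernel sources.

```python
import re

# Pattern inferred from the pasted console output only (an assumption,
# not a documented format): "[ <uptime>] sd1a: error writing fsbn N of A-B".
ERR_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<part>\w+): error writing fsbn "
    r"(?P<fsbn>\d+) of (?P<first>\d+)-(?P<last>\d+)"
)


def summarize_write_errors(log_text):
    """Return a small summary dict, or None if no write errors are found."""
    hits = list(ERR_RE.finditer(log_text))
    if not hits:
        return None
    firsts = [int(m["first"]) for m in hits]
    lasts = [int(m["last"]) for m in hits]
    times = [float(m["ts"]) for m in hits]
    return {
        "count": len(hits),
        "lowest_block": min(firsts),
        "highest_block": max(lasts),
        "seconds_spanned": max(times) - min(times),
    }


# First three entries copied from the console paste above.
SAMPLE = """\
[ 152678.9197479] sd1a: error writing fsbn 15551936 of 15551936-15551967 (sd1 bn 15551936; cn 4330 tn 21 sn 113)
[ 152679.0383219] sd1a: error writing fsbn 15552032 of 15552032-15552063 (sd1 bn 15552032; cn 4330 tn 22 sn 76)
[ 152679.1558558] sd1a: error writing fsbn 15552192 of 15552192-15552223 (sd1 bn 15552192; cn 4330 tn 23 sn 103)
"""

if __name__ == "__main__":
    print(summarize_write_errors(SAMPLE))
```

Because the pattern is not anchored to the start of a line, it should also cope with a capture where the terminal has run several messages together on one line.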

Of course, hard disks can go haywire... the Netra has two hard disks, and sd1 is the one that serves the files. Or could it be that NetBSD 10 isn't that stable? Did combined network and disk I/O cause the trouble?

I quickly power-cycled. The system rebooted, fsck'd the volumes and mounted them, and there were no more SCSI errors. NFS resumed and worked until I shut everything down on Friday evening.

I will retry next week... but it looks like esiop0 did hang on sd1!

Riccardo



