Subject: Re: 1.5.3 fxp scsi fun
To: David Brownlee <abs@formula1.com>
From: Wojciech Puchar <wojtek@chylonia.3miasto.net>
List: port-i386
Date: 07/13/2002 23:26:27
>
> We have two 1.5.3 athlon webservers which seem to be having some
> scsi/ethernet issues. Occasionally the machines just die.
> One has an siop, and the other an ahc. I believe they are both VIA
> chipsets. Does this ring a bell for anyone?
I'm not sure about the newer VIA designs, but when I had a VIA-based K6/300
system, turning off the "PCI write buffer" option solved the problem, with no
visible performance degradation.
It happened under both Linux and NetBSD. I had a Symbios Logic SCSI controller
and S3 Trio graphics; I saw the same effect with an Adaptec SCSI controller,
and with other graphics cards too.
>
> siop/fxp machine:
>
> cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
> total memory = 1023 MB
> avail memory = 869 MB
> using 8192 buffers containing 128 MB of memory
> BIOS32 rev. 0 found at 0xfb5b0
> mainbus0 (root)
> pci0 at mainbus0 bus 0: configuration mode 1
> pci0: i/o space, memory space enabled
> pchb0 at pci0 dev 0 function 0
> pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
> ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
> pci1 at ppb0 bus 1
> pci1: i/o space, memory space enabled
> pcib0 at pci0 dev 7 function 0
> pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
> [...]
> siop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
> siop0: using on-board RAM
> siop0: interrupting at irq 11
> scsibus0 at siop0: 16 targets, 8 luns per target
> fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
> fxp0: interrupting at irq 10
> fxp0: Ethernet address 00:02:b3:9c:3e:ad
> inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> serial console errors for siop/fxp when dying:
>
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0: WARNING: SCB timed out!
> fxp0 at line 2098: dmasync timeout
> fxp0: WARNING: SCB timed out!
> fxp0 at line 1617: dmasync timeout
> sd0(siop0:1:0): command timeout
> sd0(siop0:1:0): command timeout
> sd0(siop0:1:0): command timeout
> sd0(siop0:1:0): command timeout
>
> ahc/fxp machine:
> cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
> total memory = 1023 MB
> avail memory = 869 MB
> using 8192 buffers containing 128 MB of memory
> BIOS32 rev. 0 found at 0xfb5b0
> mainbus0 (root)
> pci0 at mainbus0 bus 0: configuration mode 1
> pci0: i/o space, memory space enabled
> pchb0 at pci0 dev 0 function 0
> pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
> ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
> pci1 at ppb0 bus 1
> pci1: i/o space, memory space enabled
> pcib0 at pci0 dev 7 function 0
> pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
> [...]
> ahc0 at pci0 dev 15 function 0
> ahc0: interrupting at irq 11
> ahc0: aic7892 Wide Channel A, SCSI Id=7, 16/255 SCBs
> scsibus0 at ahc0 channel 0: 16 targets, 8 luns per target
> fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
> fxp0: interrupting at irq 10
> fxp0: Ethernet address 00:02:b3:9c:8c:34
> inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> serial console errors for ahc/fxp when dying:
>
> sd1(ahc0:1:0): SCB 17 - timed out while idle, SEQADDR == 0x155
> SCSIRATE == 0x0
> sd1(ahc0:1:0): SCB 17: Immediate reset. Flags = 0x4040
> sd1(ahc0:1:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
> ahc0: target 0 using 16bit transfers
> ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
> ahc0: target 0 using 16bit transfers
> ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
> ahc0: target 1 using 16bit transfers
> ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
> ahc0: target 1 using 16bit transfers
> ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
> ahc0: Data Parity Error Detected during address or write data phase
> sd0(ahc0:0:0): SCB 19 - timed out in Data-out phase, SEQADDR == 0x5d
> SCSIRATE == 0x93
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): no longer in timeout, status = 0
> sd0(ahc0:0:0): Unexpected busfree in Message-out phase
> SEQADDR == 0x165
> sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
> [...repeated many times...]
> sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
> sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
> ahc0:A:0: unknown scsi bus phase e6. Attempting to continue
> ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
> SAVED_TCL == 0x0, ARG_1 == 0x19, SEQ_FLAGS == 0x0
> [...last two lines repeated many times...]
> ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
> SCSIRATE == 0x0
> sd0(ahc0:0:0): SCB 19: Immediate reset. Flags = 0x4050
> sd0(ahc0:0:0): no longer in timeout, status = 2
> ahc0: Issued Channel A Bus Reset. 9 SCBs aborted
> ahc0: target 0 using 16bit transfers
> ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: target 1 using 16bit transfers
> ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): SCB 18 - timed out in Command phase, SEQADDR == 0x165
> SCSIRATE == 0x93
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): SCB 19 - timed out in Command phase, SEQADDR == 0x165
> SCSIRATE == 0x93
> sd0(ahc0:0:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 12 SCBs aborted
> ahc0: target 0 using 16bit transfers
> ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
> ahc0: target 1 using 16bit transfers
> ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> ahc0: Interrupted for status of 0???
> sd1(ahc0:1:0): queue full
> sd0(ahc0:0:0): queue full
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: Interrupted for status of 0???
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> ahc0: WARNING no command for scb 22 (cmdcmplt)
> QOUTPOS = 40
> sd0(ahc0:0:0): queue full
> ahc0: Interrupted for status of 0???
> panic: biodone already
> Begin traceback...
> biodone(c9e56d5c,0,0,8,c194c480) at biodone+0x2d
> scsipi_done(c195c40c,c1456000,c195c40c) at scsipi_done+0x146
> ahc_done(c1456000,c193f460) at ahc_done+0x2e7
> ahc_search_qinfifo(c1456000,0,0,0,ff) at ahc_search_qinfifo+0xfe
> ahc_freeze_devq(c1456000,c194c480,71,0,c1456000) at ahc_freeze_devq+0x2a
> ahc_handle_seqint(c1456000,71) at ahc_handle_seqint+0x514
> ahc_intr(c1456000) at ahc_intr+0x118
> Xintr11() at Xintr11+0x78
> --- interrupt ---
> 0x805b730:
> End traceback...
> syncing disks... sd0(ahc0:0:0): SCB 18 - timed out in Message-in phase, SEQADDR == 0xdd
> SCSIRATE == 0x93
> sd0(ahc0:0:0): BDR message in message buffer
> ahc0:A:0: unknown scsi bus phase b6. Attempting to continue
> sd0(ahc0:0:0): SCB 19 - timed out while idle, SEQADDR == 0x38
> SCSIRATE == 0x0
> sd0(ahc0:0:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 6 SCBs aborted
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 10 SCBs aborted
> sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd1(ahc0:1:0): Other SCB Timeout
> sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 11 SCBs aborted
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): no longer in timeout, status = 0
> ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
> sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd1(ahc0:1:0): Other SCB Timeout
> sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0(ahc0:0:0): BDR message in message buffer
> sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
> SCSIRATE == 0x0
> sd0: dk_busy < 0
> panic: disk_unbusy
> Begin traceback...
> disk_unbusy(c1950a2c,0,5,c9efbc3c,ed036ac8) at disk_unbusy+0x31
> sddone(c195c128) at sddone+0x4f
> scsipi_done(c195c128,c1456000,c195c128) at scsipi_done+0xfd
> ahc_done(c1456000,c193f410) at ahc_done+0x2e7
> ahc_search_qinfifo(c1456000,ffffffff,41,ffffffff,ff) at ahc_search_qinfifo+0xfe
> ahc_abort_scbs(c1456000,ffffffff,41,ffffffff,ff) at ahc_abort_scbs+0x60
> ahc_reset_channel(c1456000,41,1,7fffffff,c193f3e8) at ahc_reset_channel+0x2c5
> ahc_timeout(c193f3e8) at ahc_timeout+0x284
> softclock(c1955d4c,0,ffffffff,c0198328,ecd49b88) at softclock+0x121
> Xsoftclock() at Xsoftclock+0xf
> --- interrupt ---
> param.c(ef06a45c,ecd49b88,10,40e,308c9d) at 0x3144b80
> End traceback...
>
> dumping to dev 4,1 offset 23855
> dump device bad
>
>
> --
>     David/absolute        abs@formula1.com
>
--------------------------------------------------------------------
The characteristic features of software development are exponential
growth in hardware requirements, quadratic growth in the number of bugs,
and linear growth in the number of bells and whistles, with
less-than-linear growth in functionality.