Subject: Re: fxp on i82559 still has some timeout problems, even at 10baseT
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 11/10/2003 20:33:35
[ On Saturday, November 8, 2003 at 21:59:38 (+0900), Izumi Tsutsui wrote: ]
> Subject: Re: How can I help with hung vnlock()'ed clients?
>
> Does the attached patch (the idea taken from OpenBSD) fix
> your "fxp0: device timeout" problem?
I was just about to report that your changes had seemed to fix the
problem for me with the on-board fxp0 on the Intel STL/2 motherboard
when suddenly it spewed the timeout error and the NFS mount point I was
copying from hung again. The pattern of error messages seems somewhat
different this time (although I didn't enable the link0 microcode
feature for this run, and that may be why I have not yet seen the
"dmasync timeout" errors):
Nov 10 20:04:27 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:04:33 always last message repeated 5 times
Nov 10 20:04:38 always /netbsd: fxp0: device timeout
Nov 10 20:04:48 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:04:53 always /netbsd: fxp0: device timeout
Nov 10 20:05:20 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:22 always last message repeated 2 times
Nov 10 20:05:24 always /netbsd: nfs server proven.weird.com:/build: not responding
Nov 10 20:05:24 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:29 always /netbsd: fxp0: device timeout
Nov 10 20:05:36 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:39 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:44 always /netbsd: fxp0: device timeout
Nov 10 20:06:39 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:06:44 always /netbsd: fxp0: device timeout
Nov 10 20:07:16 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:08:37 always last message repeated 38 times
Nov 10 20:11:59 always last message repeated 40 times
Nov 10 20:12:04 always /netbsd: fxp0: device timeout
Nov 10 20:15:10 always /netbsd: fxp0: WARNING: SCB timed out!
It did _seem_ to last a lot longer, and through much heavier traffic
conditions, than before. Of course the second fxp I have in this same
system, an i82550, can easily run an order of magnitude longer and has
never failed in any way.
At this point the interface is still working sporadically, well enough
for some TCP traffic, and I can still rlogin:
# netstat -in -I fxp0
Name  Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs  Colls
fxp0  1500  <Link>      00:d0:b7:b6:ad:4b  1716551     0   760047     6 489472
fxp0  1500  fe80::/64   fe80::2d0:b7ff:fe  1716551     0   760047     6 489472
fxp0  1500  204.92.254  204.92.254.7       1716551     0   760047     6 489472
The number of "Oerrs" seems to correspond exactly, as expected, to the
number of "device timeout" messages from the kernel.
(the collision rate is high because I was doing 'ping -f' while copying
from NFS and I'm still running at only 10baseT to my switch)
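
The Oerrs observation above can be cross-checked mechanically. What
follows is only a rough sketch: it assumes the log excerpt quoted
earlier is pasted in verbatim, that every line starts with the usual
"Mon DD HH:MM:SS host" prefix, and that a "last message repeated N
times" line always refers to the message logged just before it. The
names LOG, OERRS and count_messages are made up for this illustration.

```python
import re

# Kernel messages copied from the excerpt quoted earlier in this message.
LOG = """\
Nov 10 20:04:27 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:04:33 always last message repeated 5 times
Nov 10 20:04:38 always /netbsd: fxp0: device timeout
Nov 10 20:04:48 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:04:53 always /netbsd: fxp0: device timeout
Nov 10 20:05:20 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:22 always last message repeated 2 times
Nov 10 20:05:24 always /netbsd: nfs server proven.weird.com:/build: not responding
Nov 10 20:05:24 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:29 always /netbsd: fxp0: device timeout
Nov 10 20:05:36 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:39 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:05:44 always /netbsd: fxp0: device timeout
Nov 10 20:06:39 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:06:44 always /netbsd: fxp0: device timeout
Nov 10 20:07:16 always /netbsd: fxp0: WARNING: SCB timed out!
Nov 10 20:08:37 always last message repeated 38 times
Nov 10 20:11:59 always last message repeated 40 times
Nov 10 20:12:04 always /netbsd: fxp0: device timeout
Nov 10 20:15:10 always /netbsd: fxp0: WARNING: SCB timed out!
"""

# Oerrs value taken from the netstat output above.
OERRS = 6

REPEAT = re.compile(r"last message repeated (\d+) times?")


def count_messages(log_text):
    """Count each distinct message, expanding 'last message repeated' lines."""
    counts = {}
    last = None
    for line in log_text.splitlines():
        # Drop the "Mon DD HH:MM:SS host" prefix (four fields).
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        msg = parts[4]
        m = REPEAT.fullmatch(msg)
        if m:
            # Credit the repeats to the message logged just before.
            if last is not None:
                counts[last] += int(m.group(1))
            continue
        counts[msg] = counts.get(msg, 0) + 1
        last = msg
    return counts


counts = count_messages(LOG)
timeouts = counts.get("/netbsd: fxp0: device timeout", 0)
print(f"device timeouts logged: {timeouts}, Oerrs reported: {OERRS}")
```

On the excerpt above this reports 6 logged device timeouts against 6
Oerrs. The same counts dictionary also gives the total number of "SCB
timed out" warnings, repeats included.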
# ifconfig fxp0
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	address: 00:d0:b7:b6:ad:4b
	media: Ethernet autoselect (10baseT)
	status: active
	inet 204.92.254.7 netmask 0xffffff00 broadcast 204.92.254.255
	inet6 fe80::2d0:b7ff:feb6:ad4b%fxp0 prefixlen 64 scopeid 0x1
# uname -a
NetBSD always 1.6.2_RC1 NetBSD 1.6.2_RC1 (GENERIC) #5: Mon Nov 10 17:41:56 EST 2003 woods@proven:/build/woods/proven/NetBSD-1.6.x-i386-i386-obj/work/woods/m-NetBSD-1.6/sys/arch/i386/compile/GENERIC i386
FYI this device probes as:
fxp0 at pci0 dev 3 function 0: i82559 Ethernet, rev 8
fxp0: interrupting at irq 10
fxp0: Ethernet address 00:d0:b7:b6:ad:4b
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Back to the PCI add-on fxp1 for this system.....
Note that doing so restores the NFS mount to working condition.
Of course NFS still shouldn't hang, at least not so permanently, given
the mount options include '-i':
proven.weird.com:/build /proven/build nfs -b,-i,rw,nodev,nosuid 0 0
I'll leave the tests I was running in a loop overnight on the i82550
(fxp1 on this box) just to make sure it still really does run better
than fxp0 has.
--
Greg A. Woods
+1 416 218-0098 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>