Subject: Re: 2.0 fxp timeouts
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
From: Stephen Jones <smj@cirr.com>
List: port-alpha
Date: 01/21/2005 18:39:35
I've applied the patch to a non-multicpu kernel, but unfortunately the
issue persists.
fxp1: WARNING: SCB timed out!
fxp1: WARNING: SCB timed out!
fxp1: device timeout
What is interesting is that the fxp1 interface is a public network
interface, while the fxp0 interface (which carries much more traffic)
is a back end NFS interface.
It's been up for just over an hour with the patched kernel (I used the
Jan 15th source from the NetBSD-2-0-release directory).
Name  Mtu   Network       Address             Ipkts  Ierrs   Opkts  Oerrs  Colls
fxp0  1500  <Link>        00:02:56:00:0f:ad  119647      0  132672      0      0
fxp0  1500  10/24         vinland1           119647      0  132672      0      0
fxp1  1500  <Link>        00:02:56:00:0f:ae   12295      0   10323      1      0
fxp1  1500  192.94.73/24  vinland.freeshell   12295      0   10323      1      0
On Jan 21, 2005, at 6:31 AM, Izumi Tsutsui wrote:
> In article <200501142046.j0EKkTdd021953@egsner.cirr.com>
> smj@cirr.com wrote:
>
>> fxp1: WARNING: SCB timed out!
>> fxp1: device timeout
>
> How about the attached patch?
> ---
> Izumi Tsutsui
> tsutsui@ceres.dti.ne.jp
>
> --- i82557.c.orig 2005-01-19 00:24:59.000000000 +0900
> +++ i82557.c 2005-01-19 00:25:10.000000000 +0900
> @@ -916,7 +916,7 @@
> break;
> m = NULL;
>
> - if (sc->sc_txpending == FXP_NTXCB) {
> + if (sc->sc_txpending == FXP_NTXCB - 1) {
> FXP_EVCNT_INCR(&sc->sc_ev_txstall);
> break;
> }
> @@ -1070,7 +1070,7 @@
> #endif
> }
>
> - if (sc->sc_txpending == FXP_NTXCB) {
> + if (sc->sc_txpending == FXP_NTXCB - 1) {
> /* No more slots; notify upper layer. */
> ifp->if_flags |= IFF_OACTIVE;
> }
> @@ -1087,9 +1087,23 @@
> * Cause the chip to interrupt and suspend command
> * processing once the last packet we've enqueued
> * has been transmitted.
> + *
> + * To avoid a race between updating status bits
> + * by the fxp chip and clearing command bits
> + * by this function on machines which don't have
> + * atomic methods to clear/set bits in memory
> + * smaller than 32bits (both cb_status and cb_command
> + * members are uint16_t and in the same 32bit word),
> + * we have to prepare a dummy TX descriptor which has
> + * NOP command and just causes a TX completion interrupt.
> */
> - FXP_CDTX(sc, sc->sc_txlast)->txd_txcb.cb_command |=
> - htole16(FXP_CB_COMMAND_I | FXP_CB_COMMAND_S);
> + sc->sc_txpending++;
> + sc->sc_txlast = FXP_NEXTTX(sc->sc_txlast);
> + txd = FXP_CDTX(sc, sc->sc_txlast);
> + /* BIG_ENDIAN: no need to swap to store 0 */
> + txd->txd_txcb.cb_status = 0;
> + txd->txd_txcb.cb_command = htole16(FXP_CB_COMMAND_NOP |
> + FXP_CB_COMMAND_I | FXP_CB_COMMAND_S);
> FXP_CDTXSYNC(sc, sc->sc_txlast,
> BUS_DMASYNC_PREREAD|BUS_DMASYNC_PREWRITE);
>
> @@ -1221,6 +1235,11 @@
> FXP_CDTXSYNC(sc, i,
> BUS_DMASYNC_POSTREAD|BUS_DMASYNC_POSTWRITE);
>
> + /* skip dummy NOP TX descriptor */
> + if ((le16toh(txd->txd_txcb.cb_command) & FXP_CB_COMMAND_CMD)
> + == FXP_CB_COMMAND_NOP)
> + continue;
> +
> txstat = le16toh(txd->txd_txcb.cb_status);
>
> if ((txstat & FXP_CB_STATUS_C) == 0)
> --- i82557reg.h.orig 2005-01-19 00:25:22.000000000 +0900
> +++ i82557reg.h 2005-01-19 00:25:26.000000000 +0900
> @@ -368,6 +368,7 @@
> #define FXP_CB_STATUS_C 0x8000
>
> /* commands */
> +#define FXP_CB_COMMAND_CMD 0x0007 /* XXX how about FXPF_IPCB case? */
> #define FXP_CB_COMMAND_NOP 0x0
> #define FXP_CB_COMMAND_IAS 0x1
> #define FXP_CB_COMMAND_CONFIG 0x2
>
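
For anyone following the thread, the race described in the patch comment
can be modelled in a few lines of C. This is only a sketch under stated
assumptions, not driver code: the DEMO_* names and bit values are
hypothetical stand-ins for the FXP_CB_* definitions, the descriptor word is
modelled as cb_status in the low 16 bits and cb_command in the high 16 bits,
and the chip's DMA write is simulated by an ordinary assignment between the
CPU's load and store.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in i82557reg.h. */
#define DEMO_STATUS_C  0x8000u  /* "command complete" bit in cb_status */
#define DEMO_CMD_MASK  0x0007u  /* command field, like FXP_CB_COMMAND_CMD */
#define DEMO_CMD_NOP   0x0000u
#define DEMO_CMD_XMIT  0x0004u  /* assumed transmit command code */
#define DEMO_CMD_I     0x2000u  /* assumed "interrupt" flag */
#define DEMO_CMD_S     0x4000u  /* assumed "suspend" flag */

/* Model: cb_status in the low half, cb_command in the high half. */
uint32_t make_word(uint16_t status, uint16_t command)
{
    return (uint32_t)status | ((uint32_t)command << 16);
}

/*
 * Old approach: OR the I and S flags into cb_command of a descriptor
 * the chip may already be processing. Without a 16-bit store, the
 * update becomes a 32-bit read-modify-write of the whole word.
 */
uint32_t racy_set_flags(uint32_t mem, int chip_completes_midway)
{
    uint32_t tmp = mem;                  /* CPU: 32-bit load */
    if (chip_completes_midway)
        mem |= DEMO_STATUS_C;            /* chip: writes the status half */
    tmp |= (uint32_t)(DEMO_CMD_I | DEMO_CMD_S) << 16;
    mem = tmp;                           /* CPU: stores stale status back */
    return mem;
}

/* Patched approach: a fresh NOP descriptor carries the I and S flags. */
uint32_t nop_descriptor_word(void)
{
    return make_word(0, DEMO_CMD_NOP | DEMO_CMD_I | DEMO_CMD_S);
}

/* Mirrors the "skip dummy NOP TX descriptor" test in the patch. */
int is_dummy_nop(uint32_t word)
{
    return ((word >> 16) & DEMO_CMD_MASK) == DEMO_CMD_NOP;
}

/* Small self-check of the model. */
void race_demo_selfcheck(void)
{
    uint32_t pkt = make_word(0, DEMO_CMD_XMIT);

    /* The completion bit written by the chip is lost in the racy case. */
    assert((racy_set_flags(pkt, 1) & DEMO_STATUS_C) == 0);
    /* The dummy descriptor is recognised; a real packet is not. */
    assert(is_dummy_nop(nop_descriptor_word()));
    assert(!is_dummy_nop(pkt));
}
```

In this model, a lost completion bit means the transmit-done scan never sees
the descriptor finish. Because the patch writes the flags into a descriptor
the chip has not reached yet, the driver no longer modifies a word the chip
may be updating, at the cost of one ring slot (hence the FXP_NTXCB - 1
checks).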