Subject: Re: How can I help with hung vnlock()'ed clients?
To: None <art@riverstonenet.com>
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
List: tech-kern
Date: 11/08/2003 21:59:38
In article <20031105020655.GA58185@ego.yagosys.com>
art@riverstonenet.com wrote:
> The problem usually happens on the box with Intel's i82559 NIC
> (if_fxp). The NIC in question occasionally gets stuck for about a
> minute, prints "fxp0: device timeout" on the console and starts
> working again. This may or may not have something to do with the
> problem.
Does the attached patch (the idea is taken from OpenBSD) fix
your "fxp0: device timeout" problem?
(The NFS vnlock problem still occurs with this patch on my sparc64, though.)
---
Izumi Tsutsui
tsutsui@ceres.dti.ne.jp
Index: dev/ic/i82557.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/i82557.c,v
retrieving revision 1.77
diff -u -r1.77 i82557.c
--- dev/ic/i82557.c 2 Nov 2003 11:07:45 -0000 1.77
+++ dev/ic/i82557.c 8 Nov 2003 12:54:50 -0000
@@ -888,7 +888,7 @@
break;
m = NULL;
- if (sc->sc_txpending == FXP_NTXCB) {
+ if (sc->sc_txpending == FXP_NTXCB - 1) {
FXP_EVCNT_INCR(&sc->sc_ev_txstall);
break;
}
@@ -1026,7 +1026,7 @@
#endif
}
- if (sc->sc_txpending == FXP_NTXCB) {
+ if (sc->sc_txpending == FXP_NTXCB - 1) {
/* No more slots; notify upper layer. */
ifp->if_flags |= IFF_OACTIVE;
}
@@ -1043,9 +1043,23 @@
* Cause the chip to interrupt and suspend command
* processing once the last packet we've enqueued
* has been transmitted.
+ *
+ * To avoid a race between updating status bits
+ * by the fxp chip and clearing command bits
+ * by this function on machines which don't have
+ * atomic methods to clear/set bits in memory
+ * smaller than 32bits (both cb_status and cb_command
+ * members are u_int16_t and in the same 32bit word),
+ * we have to prepare a dummy TX descriptor which has
+ * NOP command and just causes a TX completion interrupt.
*/
- FXP_CDTX(sc, sc->sc_txlast)->txd_txcb.cb_command |=
- htole16(FXP_CB_COMMAND_I | FXP_CB_COMMAND_S);
+ sc->sc_txpending++;
+ sc->sc_txlast = FXP_NEXTTX(sc->sc_txlast);
+ txd = FXP_CDTX(sc, sc->sc_txlast);
+ /* BIG_ENDIAN: no need to swap to store 0 */
+ txd->txd_txcb.cb_status = 0;
+ txd->txd_txcb.cb_command = htole16(FXP_CB_COMMAND_NOP |
+ FXP_CB_COMMAND_I | FXP_CB_COMMAND_S);
FXP_CDTXSYNC(sc, sc->sc_txlast,
BUS_DMASYNC_PREREAD|BUS_DMASYNC_PREWRITE);
@@ -1174,6 +1188,11 @@
FXP_CDTXSYNC(sc, i,
BUS_DMASYNC_POSTREAD|BUS_DMASYNC_POSTWRITE);
+
+ /* skip dummy NOP TX descriptor */
+ if ((le16toh(txd->txd_txcb.cb_command) & FXP_CB_COMMAND_CMD)
+ == FXP_CB_COMMAND_NOP)
+ continue;
txstat = le16toh(txd->txd_txcb.cb_status);
Index: dev/ic/i82557reg.h
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/i82557reg.h,v
retrieving revision 1.13
diff -u -r1.13 i82557reg.h
--- dev/ic/i82557reg.h 2 Nov 2003 10:50:40 -0000 1.13
+++ dev/ic/i82557reg.h 8 Nov 2003 12:54:51 -0000
@@ -368,6 +368,7 @@
#define FXP_CB_STATUS_C 0x8000
/* commands */
+#define FXP_CB_COMMAND_CMD 0x0007 /* XXX how about FXPF_IPCB case? */
#define FXP_CB_COMMAND_NOP 0x0
#define FXP_CB_COMMAND_IAS 0x1
#define FXP_CB_COMMAND_CONFIG 0x2