Subject: Re: Transmit interrupts, fxp driver for Intel 82557/8/9 Ethernet
To: None <current-users@netbsd.org>
From: Hal Murray <murray@pa.dec.com>
List: current-users
Date: 05/24/2000 01:20:42
Sorry I didn't get back to this sooner.
> Actually, yes, the fxp driver *does* use transmit interrupts. If you
> take a look at fxp_start(), at the end of that function:
Argh/blush. I was working with 1.4.2 rather than current. Sorry
for any confusion.
Is there a better list to discuss this on?
Is there going to be a 1.4.3? If so, is this quirk worth chasing/fixing?
Long message warning. Here goes...
-----
I've been hacking/thrashing.
All my stuff got a lot better when I changed the loop control at
the top of fxp_start() from:
    while (ifp->if_snd.ifq_head != NULL && sc->tx_queued < FXP_NTXCB) {
to:
    while (ifp->if_snd.ifq_head != NULL && sc->tx_queued < (FXP_NTXCB-1)) {
That is, I changed it to use only 127 out of 128 slots.
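A minimal user-space sketch of the two loop bounds, not driver code. FXP_NTXCB is taken as 128 per the text; fill_tx_ring() and its parameters are hypothetical names. It only shows how many control blocks end up queued under each bound and says nothing about why the full ring misbehaves.

```c
#define FXP_NTXCB 128	/* ring size, per the message */

/*
 * Hypothetical model of the fxp_start() loop: move packets from a
 * software queue holding snd_len packets onto the ring until either
 * the queue is empty or tx_queued reaches limit.
 * Returns the new number of queued control blocks.
 */
static int
fill_tx_ring(int tx_queued, int snd_len, int limit)
{
	while (snd_len > 0 && tx_queued < limit) {
		snd_len--;
		tx_queued++;
	}
	return tx_queued;
}
```

Calling fill_tx_ring(0, 200, FXP_NTXCB) models the original bound, and fill_tx_ring(0, 200, FXP_NTXCB - 1) models the patched one, which always leaves one slot unused.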
But I can't figure out why 128 doesn't work.
Every time I try using all 128 slots, something nasty/obscure happens.
When I use only 127, everything works as expected. It fixes both
the big-TCP-window glitch and the UDP-blast-em case.
That might explain this observation too:
> We have a NetBSD 1.4.1/i386 box being used as a generic router on
> our network and we were running into "fxp0 timeout" problems every
> few days (but with seemingly random intervals between), but especially
> during peak traffic periods. It was enough to make the box unreliable
> as a router so we switched to ex cards instead. Peak traffic was
> probably around 4-5Mb/s bidirectional with about 1000pps per direction
> being forwarded (just a ballpark estimate), standard mix of realworld
> net traffic.
I'm hoping that somebody familiar with the hardware/driver can see
why 128 doesn't work by looking at the code. I've tried several
times and can't find anything.
I've patched the watchdog code to print more info:
    printf("%s: device timeout, TXQ=%d, SND=%d\n",
        sc->sc_dev.dv_xname, sc->tx_queued, ifp->if_snd.ifq_len);
From the tail of dmesg after running the TCP big window case:
de0: enabling 100baseTX port
fxp1: device timeout, TXQ=128, SND=9
fxp1: device timeout, TXQ=128, SND=6
fxp1: device timeout, TXQ=128, SND=6
...
Humm. I hadn't noticed this before, but the effective transmit queue
size is actually 2*FXP_NTXCB. There can be FXP_NTXCB mbufs on the
ifp->if_snd queue and another FXP_NTXCB mbufs that have been set up
on the hardware control blocks. The packets the hardware knows about
have been dequeued from ifp->if_snd so they don't get counted there.
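As a small illustration of that accounting, assuming (as described above) that if_snd can hold up to FXP_NTXCB packets; tx_backlog() and tx_backlog_max() are hypothetical helper names, not driver functions:

```c
#define FXP_NTXCB 128	/* ring size, per the message */

/*
 * Packets currently buffered for transmit: those already set up on
 * hardware control blocks plus those still waiting on if_snd.
 */
static int
tx_backlog(int tx_queued, int ifq_len)
{
	return tx_queued + ifq_len;
}

/* Worst case when both the ring and if_snd are full (assumed limits). */
static int
tx_backlog_max(void)
{
	return 2 * FXP_NTXCB;
}
```

For the first watchdog line above (TXQ=128, SND=9), tx_backlog(128, 9) gives 137 packets buffered in total.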
-----
I've been thinking about what is the right way to handle transmit
interrupts.
Using none at all clearly doesn't work right in the special case of
transmit-only traffic.
That's pretty unlikely in real life when anything interesting is
going on. (Humm. Suppose I have 2 systems connected by a hub and
the other machine gets powered off.)
Interrupt-on-every-packet is easy to understand and simple to code,
but most of the time, it wastes CPU cycles on transmit interrupt
overhead.
The -current driver does an interrupt on the last packet of the chain.
For short packets, that's the same as interrupt on every packet.
(For the machines I'm using, the mode shift happens at around 200
bytes.)
The FreeBSD driver requests an interrupt when it gets near the end
of the queue. (It uses 120 out of 128.) This seems reasonable.
The idea for interrupting before the end is to keep the hardware
running at full speed. [I can't see an easy way to reason about what
happens when the queue gets full enough to request interrupts while
receive traffic is also cleaning up the front of the transmit queue.
I think it generates some interrupts, but the exact number will depend
on timing.]
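A rough sketch of a near-the-end policy of this kind, based only on the description above and not on the FreeBSD source, which may differ in detail. TX_INTR_THRESH and want_intr_near_full() are hypothetical names, and asking for an interrupt on every block past the threshold is an assumption of this sketch.

```c
#define FXP_NTXCB	128	/* ring size, per the message */
#define TX_INTR_THRESH	120	/* hypothetical; "120 out of 128" per the message */

/*
 * Return nonzero if the control block about to be queued should
 * request a transmit-complete interrupt. tx_queued is the number of
 * blocks already on the ring before this one is added.
 */
static int
want_intr_near_full(int tx_queued)
{
	return (tx_queued + 1) >= TX_INTR_THRESH;
}
```

With these numbers the interrupt is requested while 8 slots remain, which gives the handler time to reclaim finished blocks before the hardware runs out of work.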
I've been testing with a scheme that requests an interrupt on the
last packet if the queue is over 1/2 full. This was easy to code.
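Roughly, the decision looks like the sketch below. This is a stand-alone illustration rather than the actual patch; want_intr_half_full() and its arguments are hypothetical names, and tx_queued is assumed to be the count after the current chain has been queued.

```c
#define FXP_NTXCB 128	/* ring size, per the message */

/*
 * Request a transmit interrupt only on the last packet of a chain,
 * and only if the ring is more than half full at that point.
 */
static int
want_intr_half_full(int tx_queued, int is_last)
{
	return is_last && tx_queued > FXP_NTXCB / 2;
}
```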
I was going to say that the only case where this won't work right/cleanly
is when you send a small clump and then nothing else happens before
the watchdog goes off. But there is another interesting case.
Consider the normal UDP blast-em case. My test code sends as fast
as it can until it gets an error. Then it sleeps for a tick. The
interrupt on the last packet will happen somewhere between ticks.
Between the interrupt and the next tick, the transmit queue is empty.
I see this with small packets - 38K packets/second with some unused
CPU vs 50K and CPU saturated when interrupting on every packet or
interrupting on the last packet. This matches what I expect.
For large packets, I'd expect the throughput to be significantly
less than wire speed. But it's going at (very) close to wire speed.
Ahh. I see it now. There are 128 more packets waiting on ifp->if_snd.
That's enough to cover the gaps until the timer ticks and my test
code gets woken up again.
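A back-of-the-envelope check of that, as a sketch. The per-packet wire costs (1538 bytes for a full-size frame and 84 bytes for a minimum-size frame, both counting preamble and inter-frame gap) and the tick length passed in are assumptions; the message gives neither. drain_time_us() and backlog_covers_tick() are hypothetical names.

```c
/*
 * Time in microseconds to drain npkts packets at link_mbps, where
 * wire_bytes is the per-packet cost on the wire in bytes.
 * Bits divided by Mbit/s gives microseconds directly.
 */
static long
drain_time_us(long npkts, long wire_bytes, long link_mbps)
{
	return npkts * wire_bytes * 8 / link_mbps;
}

/* Nonzero if the backlog lasts at least one tick of tick_us. */
static int
backlog_covers_tick(long npkts, long wire_bytes, long link_mbps,
    long tick_us)
{
	return drain_time_us(npkts, wire_bytes, link_mbps) >= tick_us;
}
```

With 128 full-size frames at 100 Mb/s the backlog lasts about 15.7 ms, longer than a 10 ms tick (HZ=100, an assumption). With 128 minimum-size frames it lasts under 1 ms, which is consistent with the idle gaps seen for small packets.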
-----
> With the driver in NetBSD-current, I am getting 85-90Mb/s w/ the
> fxp driver between similar machines, tho I'll see if I can reproduce
> this problem with a > 195K window.
I consistently get 90 megabits with NetBSD on a full duplex link,
either point-point or through a switch.
That's when running below the 195K cliff. The dropoff in throughput
is pretty drastic. The critical window size might depend slightly
on the CPU speed or number of switches/routers in the path.