Subject: Re: NetBSD and large pps
To: None <tls@rek.tjls.com>
From: Mihai CHELARU <kefren@netbastards.org>
List: tech-net
Date: 12/03/2004 23:36:08
Hello Thor,
Thor Lancelot Simon wrote:
> On Fri, Dec 03, 2004 at 11:02:22AM +0200, Mihai CHELARU wrote:
>
>>Software tweaks:
>> - HZ 1000
>>
>>So, I have 4000 IRQs/sec generated by scheduler. Rest of IRQs/sec is
>
>
> I'm a little confused by this. If you've set HZ=1000 (which is a very
> bad thing to set it to; the table for doing quick time computations based
> on HZ has an entry for 1024, but not for 1000), why are you getting *4000*
> interrupts per second?
>
2 CPUs and hyperthreading. This is what `systat vmstat 1` is reporting
to me: 1000 IRQs/sec for each of the four logical CPUs, so a total of
4000 IRQs/sec. Good to know that the quick time computation table has
an entry for 1024 but not for 1000; I'll change HZ ASAP.
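
The arithmetic behind that figure can be sketched in C. The helper name
below is illustrative only and is not taken from any kernel source; it
simply assumes that every logical CPU takes its own clock tick HZ times
per second, which is what the systat output suggests.

```c
#include <assert.h>

/*
 * Clock interrupts per second across the whole machine, assuming every
 * logical CPU takes its own clock tick HZ times per second.
 * (Illustrative helper, not a NetBSD kernel function.)
 */
static unsigned clock_irqs_per_sec(unsigned packages,
                                   unsigned threads_per_package,
                                   unsigned hz)
{
    assert(packages > 0 && threads_per_package > 0 && hz > 0);
    return packages * threads_per_package * hz;
}
```

With two hyperthreaded CPUs and HZ=1000, clock_irqs_per_sec(2, 2, 1000)
gives 4000, matching the systat figure; with HZ=1024 it would give 4096.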
>
>>NAPI means RX polling, meaning that not for every packet received there
>>is an IRQ generated. Don't know more, this is what I understood from
>
>
> This is a perfect example of why it's a really bad idea to use marketing
> terms in technical discussion.
Sorry, I'm a bit lost when discussing PCs and ordinary
servers/workstations in general, and their lousy architecture :) This is
what I found on Google about the RX offload you describe here.
> By invoking the amorphous "NAPI", you
> have confused two different techniques, interrupt pacing (a.k.a. interrupt
> "coalescing") and strict polling. Both trade latency for throughput; the
> latter requires moderately painful support in the kernel but will work with
> any network card, the former requires hardware support in the interface
> card but little support in the kernel.
>
> Interrupt pacing or coalescing just means that the card buffers packets
> internally and only generates one interrupt every N packets, usually with
> a timer so that it generates an interrupt at least every N microseconds
> (this puts an upper bound on latency). The wm and bge hardware supports
> this (for that matter, so do tlp and lots of other older cards) but the
> tricky thing is setting the interrupt threshold and maximum-latency timer
> correctly for your application. Jonathan has just given you suggestions
> on how to set these thresholds better for what you're doing.
>
^^^^^^^^^^^^^
This is what I was referring to :)
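
To make the pacing/coalescing description concrete, here is a toy model
in C. It is only a sketch under stated assumptions: the struct and
function names are hypothetical and do not correspond to wm or bge
registers or to any driver code. It assumes the card raises an interrupt
once max_pkts packets are pending, or once the oldest pending packet has
waited max_delay_us, whichever comes first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a coalescing NIC; not any real driver's registers. */
struct coalesce_cfg {
    unsigned max_pkts;      /* interrupt once this many packets are pending */
    uint64_t max_delay_us;  /* ...or once the oldest one has waited this long */
};

/*
 * Count the interrupts raised for packets arriving at the times in
 * arrival_us[] (microseconds, sorted in ascending order).
 */
static size_t count_interrupts(const struct coalesce_cfg *cfg,
                               const uint64_t *arrival_us, size_t n)
{
    size_t irqs = 0, pending = 0;
    uint64_t oldest = 0;

    assert(cfg != NULL && cfg->max_pkts > 0);
    for (size_t i = 0; i < n; i++) {
        /* The latency timer expired before this packet arrived: flush. */
        if (pending > 0 && arrival_us[i] - oldest >= cfg->max_delay_us) {
            irqs++;
            pending = 0;
        }
        if (pending == 0)
            oldest = arrival_us[i];
        pending++;
        /* The packet-count threshold was reached: flush. */
        if (pending >= cfg->max_pkts) {
            irqs++;
            pending = 0;
        }
    }
    /* The timer eventually fires for whatever is still pending. */
    if (pending > 0)
        irqs++;
    return irqs;
}
```

In this model, eight back-to-back packets with max_pkts = 4 cost two
interrupts instead of eight, while sparse traffic is bounded by the
timer and still gets one interrupt per packet. That is the tuning
problem described above: both values have to suit the traffic.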
> Polling just ignores network interrupts completely, and enforces a strict
> latency/throughput trade-off by reading from the network device according
> to a timer. This avoids interrupt-service overhead, at the expense of
> significant software complexity and of always making the _worst-case_
> latency decision, rather than treating the increased latency as an upper
> bound.
>
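
For comparison, strict polling can be modelled the same way. This is
again only a sketch with made-up names, not kernel code (a stock NetBSD
kernel does not poll, as noted below). It assumes the card is read at
fixed ticks 0, period, 2*period, and so on, regardless of traffic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time at which a timer-driven poll picks up a packet that arrived at
 * t_us, assuming polls happen at multiples of period_us.
 * (Hypothetical helper for illustration.)
 */
static uint64_t poll_pickup_us(uint64_t t_us, uint64_t period_us)
{
    assert(period_us > 0);
    /* Round the arrival time up to the next poll tick. */
    return (t_us + period_us - 1) / period_us * period_us;
}

/* Reads of the card per second: fixed by the timer, independent of load. */
static uint64_t polls_per_sec(uint64_t period_us)
{
    assert(period_us > 0);
    return 1000000 / period_us;
}
```

A packet arriving just after a tick always waits almost a full period,
even on an idle link, and the only value to tune is period_us.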
> Basically, if we knew how to set the coalescing thresholds and timers
> automatically, and we could get our interrupt code efficient enough,
> the first approach would always win, given cards that support it. But
> we don't, and one advantage of polling (which, mind you, a stock NetBSD
> kernel can't do) is that at least there's only one value to adjust, the
> timer value according to which we read from the card.
>
> I still don't know what "NAPI" is, but hopefully this will help you
> understand the actual technical issues at work here.
>
>
It did, thank you.