Subject: Re: NetBSD and large pps
To: None <tls@rek.tjls.com>
From: Mihai CHELARU <kefren@netbastards.org>
List: tech-net
Date: 12/03/2004 23:36:08
Hello Thor,

Thor Lancelot Simon wrote:
> On Fri, Dec 03, 2004 at 11:02:22AM +0200, Mihai CHELARU wrote:
> 
>>Software tweaks:
>>	- HZ 1000
>>
>>So, I have 4000 IRQs/sec generated by scheduler. Rest of IRQs/sec is 
> 
> 
> I'm a little confused by this.  If you've set HZ=1000 (which is a very
> bad thing to set it to; the table for doing quick time computations based
> on HZ has an entry for 1024, but not for 1000), why are you getting *4000*
> interrupts per second?
> 

2 CPUs and hyperthreading. This is what `systat vmstat 1` is reporting 
to me: 1000 IRQs/sec for each of the four CPUs, so a total of 4000 
IRQs/sec. Good to know that the quick time computation table has an 
entry for 1024 but not for 1000; I'll change it ASAP.
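
As a quick check of that arithmetic, here is a minimal sketch. It assumes 
that every logical CPU takes its own clock interrupt at HZ, which is what 
the systat output suggests; the variable names are only illustrative.

```python
# Assumption: each logical CPU receives its own clock tick at HZ.
HZ = 1000              # kernel clock frequency set in the kernel config
physical_cpus = 2
threads_per_cpu = 2    # hyperthreading exposes two logical CPUs per package

logical_cpus = physical_cpus * threads_per_cpu
clock_irqs_per_sec = HZ * logical_cpus

print(logical_cpus, clock_irqs_per_sec)
```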

> 
>>NAPI means RX polling, meaning that not for every packet received there 
>>is an IRQ generated. Don't know more, this is what I understood from 
> 
> 
> This is a perfect example of why it's a really bad idea to use marketing
> terms in technical discussion.  

Sorry, I'm a bit lost when discussing PCs and the usual 
servers/workstations in general and their lousy architecture :) This is 
what I found on Google about the RX offload you describe here.

> By invoking the amorphous "NAPI", you
> have confused two different techniques, interrupt pacing (a.k.a. interrupt
> "coalescing") and strict polling.  Both trade latency for throughput; the
> latter requires moderately painful support in the kernel but will work with
> any network card, the former requires hardware support in the interface
> card but little support in the kernel.
> 
> Interrupt pacing or coalescing just means that the card buffers packets
> internally and only generates one interrupt every N packets, usually with
> a timer so that it generates an interrupt at least every N microseconds
> (this puts an upper bound on latency).  The wm and bge hardware supports
> this (for that matter, so do tlp and lots of other older cards) but the
> tricky thing is setting the interrupt threshold and maximum-latency timer
> correctly for your application.  Jonathan has just given you suggestions
> on how to set these thresholds better for what you're doing.
> 

^^^^^^^^^^^^^
This is what I was referring to :)
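
To make sure I have the mechanism right, here is a rough simulation of 
the coalescing behaviour described above: interrupt after N buffered 
packets, or when a latency timer expires, whichever comes first. The 
function name and parameters are made up for illustration and do not 
correspond to any wm or bge register; real cards differ in how the timer 
is started and reset.

```python
def coalesced_interrupts(arrivals_us, max_packets, max_delay_us):
    """Return the times (in microseconds) at which the card would interrupt.

    arrivals_us  -- sorted packet arrival times in microseconds
    max_packets  -- interrupt once this many packets are buffered
    max_delay_us -- interrupt no later than this long after the first
                    buffered packet (the upper bound on added latency)
    """
    interrupts = []
    pending = 0        # packets buffered since the last interrupt
    deadline = None    # when the latency timer for this batch fires

    for t in arrivals_us:
        # The timer may have fired before this packet arrived.
        if pending and t >= deadline:
            interrupts.append(deadline)
            pending = 0
        # The first packet of a new batch starts the latency timer.
        if pending == 0:
            deadline = t + max_delay_us
        pending += 1
        # Packet-count threshold reached: interrupt immediately.
        if pending >= max_packets:
            interrupts.append(t)
            pending = 0

    # Whatever is still buffered is flushed when its timer expires.
    if pending:
        interrupts.append(deadline)
    return interrupts


# Heavy load: 100 packets, one every 10 us -> far fewer than 100 interrupts.
busy = coalesced_interrupts(list(range(0, 1000, 10)), 8, 100)
# Light load: each lone packet is delayed by the timer, but no longer.
idle = coalesced_interrupts([0, 500, 1000], 8, 100)
print(len(busy), idle)
```

Under load the packet threshold dominates and the interrupt rate drops; 
on a quiet link the timer dominates and bounds the extra latency. These 
are the two values that have to be tuned per application.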

> Polling just ignores network interrupts completely, and enforces a strict
> latency/throughput trade-off by reading from the network device according
> to a timer.  This avoids interrupt-service overhead, at the expense of
> significant software complexity and of always making the _worst-case_
> latency decision, rather than treating the increased latency as an upper
> bound.
> 
> Basically, if we knew how to set the coalescing thresholds and timers
> automatically, and we could get our interrupt code efficient enough,
> the first approach would always win, given cards that support it.  But
> we don't, and one advantage of polling (which, mind you, a stock NetBSD
> kernel can't do) is that at least there's only one value to adjust, the
> timer value according to which we read from the card.
> 
> I still don't know what "NAPI" is, but hopefully this will help you
> understand the actual technical issues at work here.
> 
> 

It did, thank you.
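
For comparison, here is how I understand strict polling, again only as a 
sketch under my own assumptions: network interrupts are ignored and the 
device is read on a fixed timer, so every packet waits for the next tick 
no matter how busy the link is. The function name and the tick alignment 
(ticks at multiples of the period) are hypothetical.

```python
def polling_latencies(arrivals_us, poll_period_us):
    """Delay each packet sees when the device is read only on timer ticks.

    Assumes ticks occur at integer multiples of poll_period_us and that
    arrival times are non-negative integers in microseconds.
    """
    latencies = []
    for t in arrivals_us:
        # Round t up to the next tick using integer ceiling division.
        next_poll = -(-t // poll_period_us) * poll_period_us
        latencies.append(next_poll - t)
    return latencies


# A packet arriving just after a tick waits almost a full period,
# even if it is the only packet on the wire.
print(polling_latencies([1, 250, 300], 100))
```

There is only one knob here, the poll period, which matches the point 
above about polling having a single value to adjust.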