tech-kern archive
Re: RFC: softint-based if_input
Date: Mon, 25 Jan 2016 11:25:16 +0900
From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
On Tue, Jan 19, 2016 at 2:22 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
(snip)
>> (a) a per-CPU pktq that never distributes packets to another CPU, or
>> (b) a single-CPU pktq, to be used only from the CPU to which the
>> device's (queue's) interrupt handler is bound.
>>
> I'll rewrite the patch following your suggestion (I prefer (a) for now).
While rewriting it, I came to feel that the result is just a lesser
version of pktqueue. So I think it may be better to change pktqueue to
take a flag that disables distributing packets between CPUs than to
implement another mechanism that duplicates pktqueue. Here is a patch
with that approach:
http://www.netbsd.org/~ozaki-r/pktq-without-ipi.diff
If we call pktq_create with PKTQ_F_NO_DISTRIBUTION, pktqueue doesn't
set up an IPI for the softint and never calls softint_schedule_cpu
(i.e., never distributes packets).
What do you think of this approach?
Some disjointed thoughts:
1. I don't think you actually need to change pktq(9). It looks like
if you pass in cpu_index(curcpu()) for the hash, it will consistently
use the current CPU, for which softint_schedule_cpu has a special case
that avoids ipi. So I don't expect it's substantially different from
<https://www.netbsd.org/~ozaki-r/softint-if_input.diff> -- though
maybe measurements will show my analysis is wrong!
2. Even though you avoid ipi(9), you're still using pcq(9), which
requires interprocessor synchronization -- but that is an unnecessary
cost because you're simply passing packets from hardintr to softintr
context on a single CPU. So that's why I specifically suggested ifq,
not pcq or pktqueue.
3. Random thought: If we do polling, I wonder whether instead of (or
in addition to) polling for up to (say) 100 packets in a softint, we
really ought to poll for arbitrarily many packets in a kthread with
KTHREAD_TS, so that we don't need to go back and forth between
hardintr/softintr during high throughput, but we also don't starve
user threads in that case.
I seem to recall starvation of user threads is what motivated matt@ to
split packet processing between a softint and a workqueue, depending
on the load, in bcmeth(4) (sys/arch/arm/broadcom/bcm53xx_eth.c).
Maybe he can comment on this? Have you studied how this driver works,
and maybe pq3etsec(4) too, which also does polling?