tech-kern archive


Re: RFC: softint-based if_input



On Mon, Jan 25, 2016 at 3:53 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
> On Mon, Jan 25, 2016 at 1:06 PM, Taylor R Campbell
> <campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
>>    Date: Mon, 25 Jan 2016 11:25:16 +0900
>>    From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
>>
>>    On Tue, Jan 19, 2016 at 2:22 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
>>    (snip)
>>    >> (a) a per-CPU pktq that never distributes packets to another CPU, or
>>    >> (b) a single-CPU pktq, to be used only from the CPU to which the
>>    >> device's (queue's) interrupt handler is bound.
>>    >>
>>    > I'll rewrite the patch as your suggestion (I prefer (a) for now).
>>
>>    Through rewriting it, I feel that it seems to be a lesser version of
>>    pktqueue. So I think it may be better changing pktqueue to have a flag
>>    to not distribute packets between CPUs than implementing another one
>>    duplicating pktqueue. Here is a patch with the approach:
>>    http://www.netbsd.org/~ozaki-r/pktq-without-ipi.diff
>>
>>    If we call pktq_create with PKTQ_F_NO_DISTRIBUTION, pktqueue doesn't
>>    setup IPI for softint and never call softint_schedule_cpu (i.e.,
>>    never distribute packets).
>>
>>    How about the approach?
>>
>> Some disjointed thoughts:
>>
>> 1. I don't think you actually need to change pktq(9).  It looks like
>> if you pass in cpu_index(curcpu()) for the hash, it will consistently
>> use the current CPU, for which softint_schedule_cpu has a special case
>> that avoids ipi.  So I don't expect it's substantially different from
>> <https://www.netbsd.org/~ozaki-r/softint-if_input.diff> -- though
>> maybe measurements will show my analysis is wrong!
>
> My intention is to prevent ipi_register in pktq_create and
> so we don't need ipi_sysinit movement...
>
>>
>> 2. Even though you avoid ipi(9), you're still using pcq(9), which
>> requires interprocessor synchronization -- but that is an unnecessary
>> cost because you're simply passing packets from hardintr to softintr
>> context on a single CPU.  So that's why I specifically suggested ifq,
>> not pcq or pktqueue.
>
> ...though, right. membars in pcq(9) are just overhead.
>
> Okay, I'll implement softint + percpu irqs.

Here it is: http://www.netbsd.org/~ozaki-r/softint-if_input-ifqueue.diff

I have also added the results of its performance measurements to
https://gist.github.com/ozaki-r/975b06216a54a084debc

The results are good, but they bother me: it achieves better performance
than vanilla (and the 1st implementation) under high load (IP forwarding).
For fast forward, it also beats the 1st one.

I thought that holding splnet during ifp->if_input might affect the
results (splnet is needed for ifqueue operations, so the patch keeps
holding it across the call). So I tried releasing it during
ifp->if_input, but the results didn't change much (IP forwarding is
still better than vanilla).
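
The two variants compared above can be sketched as a small userland
model. This is only an illustration under stated assumptions, not code
from the patch: every name in it (model_splnet, model_splx, pkt_queue,
model_hardintr_rx, model_softint_drain, model_if_input) is hypothetical,
and the interrupt priority level is modelled as a plain counter. It
shows a hard-interrupt side that enqueues under the raised level and a
softint side that dequeues under it, either keeping it raised or
dropping it around the input callback.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model only: none of these names are NetBSD kernel APIs. */
struct pkt {
	struct pkt *next;
	int id;
};

struct pkt_queue {
	struct pkt *head, *tail;
	int len, maxlen, drops;
};

static int spl_depth;	/* stand-in for "network interrupts blocked" */
static int processed;	/* packets handed to the input callback */
static int held_calls;	/* callbacks that ran with the level raised */

/* Raise the modelled level and return the previous one. */
static int model_splnet(void) { return spl_depth++; }
/* Restore a previously saved level. */
static void model_splx(int s) { spl_depth = s; }

/* Append to the tail; drop the packet if the queue is full. */
static int q_enqueue(struct pkt_queue *q, struct pkt *p)
{
	if (q->len >= q->maxlen) {
		q->drops++;
		return -1;
	}
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
	return 0;
}

/* Remove from the head; returns NULL when the queue is empty. */
static struct pkt *q_dequeue(struct pkt_queue *q)
{
	struct pkt *p = q->head;

	if (p == NULL)
		return NULL;
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	q->len--;
	p->next = NULL;
	return p;
}

/* Hard-interrupt side: queue the packet under the raised level. */
static int model_hardintr_rx(struct pkt_queue *q, struct pkt *p)
{
	int s = model_splnet();
	int error = q_enqueue(q, p);

	model_splx(s);
	return error;
}

/* Stand-in for ifp->if_input; records whether the level was raised. */
static void model_if_input(struct pkt *p)
{
	(void)p;
	processed++;
	if (spl_depth > 0)
		held_calls++;
}

/* Softint side: drain the queue, optionally lowering the level
 * around each callback. Returns the number of packets processed. */
static int model_softint_drain(struct pkt_queue *q, int release_during_input)
{
	struct pkt *p;
	int n = 0;
	int s = model_splnet();

	while ((p = q_dequeue(q)) != NULL) {
		if (release_during_input)
			model_splx(s);
		model_if_input(p);
		if (release_during_input)
			s = model_splnet();
		n++;
	}
	model_splx(s);
	return n;
}
```

In this model the queue is only ever touched with the level raised, so
no interprocessor synchronization is involved; the only difference
between the two variants is whether model_if_input runs with
spl_depth > 0.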

Anyone have any ideas?

  ozaki-r

