Re: workqueue for pr

Apologies for the earlier email that was one big text blob...

On 23/01/2017 7:58 PM, Ryota Ozaki wrote:

Hi,

I propose a controversial change that makes some
pr_input workqueue: icmp_input, icmp6_input and
carp_input.

...

Since you are redirecting icmp6_input()...

In response to (say) a ping flood, how large can or will the work queue
grow?Or unreachable flood?

How does this change prevent a DoS attack against NetBSD kernel memory
through ICMP message flooding of ICMP6 and growing the workqueue?

Granted there are certain ICMP6 messages that are more complex to deal
with that will benefit from this change but there are also a number of
ICMP6 messages for which this change means "nothing" (the 30usec increase
in time to respond to ping is not what I'd call meaningful.) Do they all
need to be handled in this fashion when the above attack scenario is
considered? Is it worth being able to distinguish using this input function
for handling ICMP6 messages that are local to the directly attached subnet
only vs those that are not
local to the subnet?

And a comment for workqueue, If I declare a work queue like this:
+ error = workqueue_create(&icmp6_input_wq, "icmp6_input",
+ icmp6_input_work, NULL, PRI_SOFTNET, IPL_SOFTNET,WQ_MPSAFE);

then the work queue function like this:
+icmp6_input_work(struct work *wk, void *arg)
+{
...
+ s = splsoftnet();

should not need a second explicit reference to an already nominated SPL.
This should be something like:

s = workqueue_spl(w);

"But the IPL_SOFTNET is for something else." Sure, some workqueue instances
may use a different IPL inside the work to that used to that in the process
of doing work.

(Why does workqueue_init() not use the "ipl" arg? subr_workqueue.c v1.33)

On a similar note this:
+ s = splsoftnet();
+ mutex_enter(softnet_lock);

might be better done as this:

s = splsoftnet_lock()

but then how does this work:
+ mutex_exit(softnet_lock);
+ splx(s);
(for which I don't have a better answer than splxsoftnet_unlock() and
 I don't know that this is even a good answer. Anyway, back to the task at hand...)

In terms of flooding, this change introduces at least four new memory bus
synchronisation events per input packet. In a push for lock-less networking,
is this desirable (even if the packets are not processed in-line)?

And that's not including SPL changes or the locking inside workqueue.

A way to halve that would be for icmp6_input_work to simply take the entire
chain of work at the start of icmp6_input_work() and work on that privately,
rather than do each item individually. Nothing else is going to look at what's
on the work queue so there should be no problem. Look at workqueue_worker()
for an example of how to save on locking and if the abstraction violation
comment is anything to go by then maybe a new set of interfaces is required
for <sys/queue.h> to support queues, etc, being copied privately and then
reset. Or maybe there are other ways to approach this problem with changes
to workqueue but the way the patch is not the best.

How would you implement this change without adding any new locks to icmp6.c?

Cheers, Darren

Re: workqueue for pr_input