tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: percpu_foreach() does not execute remotely
[Cc'ing the wizards at IIJ because a diagnostic measure we're
discussing may require adapting the network stack.]
> Date: Wed, 29 Jan 2020 05:43:39 -0800
> From: Jason Thorpe <thorpej%me.com@localhost>
>
> > On Jan 28, 2020, at 9:46 PM, Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
> >
> > Normally sleeping under percpu_getref -- even on an adaptive lock --
> > is a no-no for precisely this reason. (percpu_getref, while very nice
> > in some ways, is an extremely delicate abstraction! It would be nice
> > if violating this rule triggered an immediate panic.)
>
> My point was about the percpu_swap_lock ... percpu_foreach() is
> protected from this scenario by its use of that lock,
> percpu_getref() is not. Maybe we want to consider adding panics in
> the cases where a thread goes to sleep while preemption is disabled?
Sounds reasonable, although it's not immediately clear to me that
kpreempt_disable is the right criterion -- e.g., kpreempt_disable is
mandatory to call mi_switch, so this might cause trouble in the
scheduler, and there may be other logic out there that relies on
preventing _preemption_ but deliberately _sleeps_.
Perhaps we should have an LWP or CPU flag saying we're currently in
the middle of percpu_getref/putref; we can then add another case to
assert_sleepable in kern/kern_lock.c and an assertion in mi_switch.
However, there's some things we might have to adapt to avoid
spuriously triggering this (some of which already look a little
questionable to me):
net/if.h if_tunnel_get_ro:
percpu_getref then mutex_enter on an adaptive lock. I think this is
not a problem right now because we never use the percpu pointer
after mutex_enter.
I think it should be enough to move percpu_putref out of
if_tunnel_put_ro and into if_tunnel_get_ro -- holding the lock
should be enough to serialize access to the struct route.
net/if_l2tp.c l2tp_ifq_percpu_getref:
Held across if_tunnel_get_ro, via in6_l2tp_output, which might sleep
on an adaptive lock.
I think it should be enough to move percpu_putref out of
l2tp_ifq_percpu_putref and into l2tp_ifq_percpu_getref (perhaps
change the name for clarity), and to use curlwp_bind/bindx instead.
net/route.h rtcache_percpu_getref:
Held across large subsystems, too many for me to examine -- e.g.,
ip_output, which might acquire the kernel lock which might sleep.
Since the percpu pointer is never used after it returns, though, I
think only curlwp_bind/bindx is needed here.
netinet/wqinput.c wqinput_percpu_getref:
Held across (a) softnet_lock, which is adaptive, in wqinput_work;
and across (b) pool_get on an IPL_NONE pool, which also takes an
adaptive lock, in wqinput_input.
I think (a) is technically OK now because wqinput_work is bound to a
CPU anyway, so access to wwl should be serialized on that CPU (no
sleeps in wqinput_work_get), but (b) might be a problem unless we
can prove wqinput_input is bound to a CPU, which I ran out of juice
to do.
In any case, I think the percpu_putref can again be moved into
wqinput_percpu_getref (and perhaps the name should be changed for
clarity), since we never use the percpu pointer again -- but we
should either add curlwp_bind/bindx or comment on a proof that
wqinput_input is already bound to the CPU.
> IF we did that, then we should also disable preemption after
> acquiring percpu_swap_lock to catch bad behavior by the
> percpu_foreach() callback.
I don't think so, because it's OK for the percpu_foreach callback to
mutex_enter/exit on an adaptive lock even though that might sleep.
What's not OK is sleeping for allocation or other long durations.
Merely holding percpu_swap_lock during percpu_foreach is enough to
prevent concurrent percpu_cpu_swap.
Home |
Main Index |
Thread Index |
Old Index