Thread (38 messages) 38 messages, 5 authors, 2024-05-24

Re: [PATCH v2 net-next 00/15] locking: Introduce nested-BH locking.

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: 2024-05-06 09:38:32
Also in: lkml

On 2024-05-06 10:43:49 [+0200], Paolo Abeni wrote:
On Fri, 2024-05-03 at 20:25 +0200, Sebastian Andrzej Siewior wrote:
quoted
Disabling bottoms halves acts as per-CPU BKL. On PREEMPT_RT code within
local_bh_disable() section remains preemtible. As a result high prior
tasks (or threaded interrupts) will be blocked by lower-prio task (or
threaded interrupts) which are long running which includes softirq
sections.

The proposed way out is to introduce explicit per-CPU locks for
resources which are protected by local_bh_disable() and use those only
on PREEMPT_RT so there is no additional overhead for !PREEMPT_RT builds.
Let me rephrase to check I understood the plan correctly.

The idea is to pair 'bare' local_bh_{disable,enable} with local lock
and late make local_bh_{disable,enable} no ops (on RT).

With 'bare' I mean not followed by a spin_lock() - which is enough to
ensure mutual exclusion vs BH on RT build - am I correct?
I might have I misunderstood your rephrase. But to make it clear:
| $ git grep -p local_lock\( kernel/softirq.c
| kernel/softirq.c=void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
| kernel/softirq.c:                       local_lock(&softirq_ctrl.lock);

this is what I want to remove. This is upstream RT only (not RT queue
only). !RT builds are not affected by this change.
quoted
The series introduces the infrastructure and converts large parts of
networking which is largest stake holder here. Once this done the
per-CPU lock from local_bh_disable() on PREEMPT_RT can be lifted.
AFAICS there are a bunch of local_bh_* call-sites under 'net' matching
the above description and not addressed here. Is this series supposed
to cover 'net' fully?
The net subsystem has not been fully audited but the major parts have
been. I checked global per-CPU variables but there might be dynamic
ones. Also new ones might have appeared in the meantime. There are
two things which are not fixed yet that I am aware of:
- tw_timer timer
  https://lore.kernel.org/all/20240415113436.3261042-1-vschneid@redhat.com/T/#u (local)

- can gw
  https://lore.kernel.org/linux-can/20231031112349.y0aLoBrz@linutronix.de/ (local)
  https://lore.kernel.org/all/20231221123703.8170-1-socketcan@hartkopp.net/T/#u (local)

That means those two need to be fixed first before that local_local()
can disappear from local_bh_disable()/ enable. Also the whole tree
should be checked.
Could you please include the diffstat for the whole series? I
think/hope it will help catching the full picture more easily.
total over the series:

| include/linux/filter.h              | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
| include/linux/local_lock.h          |  21 +++++++++++++++++++++
| include/linux/local_lock_internal.h |  31 +++++++++++++++++++++++++++++++
| include/linux/lockdep.h             |   3 +++
| include/linux/netdevice.h           |  12 ++++++++++++
| include/linux/sched.h               |   9 ++++++++-
| include/net/seg6_local.h            |   1 +
| include/net/sock.h                  |   5 +++++
| kernel/bpf/cpumap.c                 |  27 +++++++++++----------------
| kernel/bpf/devmap.c                 |  16 ++++++++--------
| kernel/fork.c                       |   3 +++
| kernel/locking/spinlock.c           |   8 ++++++++
| net/bpf/test_run.c                  |  11 ++++++++++-
| net/bridge/br_netfilter_hooks.c     |   7 ++++++-
| net/core/dev.c                      |  39 +++++++++++++++++++++++++++++++++------
| net/core/dev.h                      |  20 ++++++++++++++++++++
| net/core/filter.c                   | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------
| net/core/lwt_bpf.c                  |   9 +++++----
| net/core/skbuff.c                   |  25 ++++++++++++++++---------
| net/ipv4/tcp_ipv4.c                 |  15 +++++++++++----
| net/ipv4/tcp_sigpool.c              |  17 +++++++++++++----
| net/ipv6/seg6_local.c               |  22 ++++++++++++++--------
| net/xdp/xsk.c                       |  19 +++++++++++--------
| 23 files changed, 445 insertions(+), 116 deletions(-)
Note that some callers use local_bh_disable(), no additional lock, and
there is no specific struct to protect, but enforce explicit
serialization vs bh to a bunch of operation, e.g.  the
local_bh_disable() in inet_twsk_purge().

I guess such call site should be handled, too?
Yes but I didn't find much. inet_twsk_purge() is the first item from my
list. On RT spin_lock() vs spin_lock_bh() usage does not deadlock and
could be mixed.

The only resources that can be protected by disabling BH are per-CPU
resources. Either explicit defined (such as napi_alloc_cache) or
implicit by other means of per-CPU usage such as a CPU-bound timer,
worker, …. Protecting global variables by disabling BH is broken on SMP
(see the CAN gw example) so I am not too worried about those.
Unless you are aware of a category I did not think of.
Thanks!

Paolo
Sebastian
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help