Thread: 37 messages, 11 authors, 2021-06-11

Re: [PATCH bpf-next 02/17] bpf: allow RCU-protected lookups to happen from bh context

From: Daniel Borkmann <daniel@iogearbox.net>
Date: 2021-06-10 21:24:25
Also in: netdev

Hi Paul,

On 6/10/21 8:38 PM, Alexei Starovoitov wrote:
> On Wed, Jun 9, 2021 at 7:24 AM Toke Høiland-Jørgensen [off-list ref] wrote:
>> XDP programs are called from a NAPI poll context, which means the RCU
>> reference liveness is ensured by local_bh_disable(). Add
>> rcu_read_lock_bh_held() as a condition to the RCU checks for map lookups so
>> lockdep understands that the dereferences are safe from inside *either* an
>> rcu_read_lock() section *or* a local_bh_disable() section. This is done in
>> preparation for removing the redundant rcu_read_lock()s from the drivers.
>>
>> Signed-off-by: Toke Høiland-Jørgensen <redacted>
>> ---
>>   kernel/bpf/hashtab.c  | 21 ++++++++++++++-------
>>   kernel/bpf/helpers.c  |  6 +++---
>>   kernel/bpf/lpm_trie.c |  6 ++++--
>>   3 files changed, 21 insertions(+), 12 deletions(-)
>>
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index 6f6681b07364..72c58cc516a3 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
>> @@ -596,7 +596,8 @@ static void *__htab_map_lookup_elem(struct bpf_map *map, void *key)
>>          struct htab_elem *l;
>>          u32 hash, key_size;
>>
>> -       WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
>> +       WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
>> +                    !rcu_read_lock_bh_held());
>
> It's not clear to me whether rcu_read_lock_held() is still needed.
> All comments sound like rcu_read_lock_bh_held() is a superset of rcu
> that includes bh.
> But reading rcu source code it looks like RCU_BH is its own rcu flavor...
> which is confusing.

The series is a bit confusing to me as well. I recall we had a discussion with
Paul, but it was back in 2016 aka very early days of XDP to get some clarifications
about RCU vs RCU-bh flavour on this. Paul, given the series in here, I assume the
below is not true anymore, and in this case (since we're removing rcu_read_lock()
from drivers), the RCU-bh acts as a real superset?
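
For readers following the condition under discussion, here is a minimal userspace sketch of the check before and after the quoted hunk. It is not kernel code: `struct ctx_model` and the two `lookup_would_warn_*()` helpers are hypothetical names invented for illustration, and the three booleans merely stand in for the results of rcu_read_lock_held(), rcu_read_lock_trace_held() and rcu_read_lock_bh_held(). It only shows which combinations trip the warning; it says nothing about whether the protection is actually sufficient, which is the question put to Paul.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the lockdep state seen at the lookup site. */
struct ctx_model {
	bool rcu_held;       /* models rcu_read_lock_held() */
	bool rcu_trace_held; /* models rcu_read_lock_trace_held() */
	bool rcu_bh_held;    /* models rcu_read_lock_bh_held() */
};

/* Condition removed by the hunk: warn unless one of two read sides is held. */
static bool lookup_would_warn_old(const struct ctx_model *c)
{
	return !c->rcu_held && !c->rcu_trace_held;
}

/* Condition added by the hunk: a bh-disabled section also silences the warning. */
static bool lookup_would_warn_new(const struct ctx_model *c)
{
	return !c->rcu_held && !c->rcu_trace_held && !c->rcu_bh_held;
}
```

In this model, a context with only `rcu_bh_held` set (the NAPI poll case described in the commit message, once the driver's rcu_read_lock() is gone) warns under the old condition and not under the new one.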

Back then from your clarifications this was not the case:

   On Mon, Jul 25, 2016 at 11:26:02AM -0700, Alexei Starovoitov wrote:
   > On Mon, Jul 25, 2016 at 11:03 AM, Paul E. McKenney
   > [off-list ref] wrote:
   [...]
   >>> The crux of the question is whether a particular driver rx handler, when
   >>> called from __do_softirq, needs to add an additional rcu_read_lock or
   >>> whether it can rely on the mechanics of softirq.
   >>
   >> If it was rcu_read_lock_bh(), you could.
   >>
   >> But you didn't say rcu_read_lock_bh(), you instead said rcu_read_lock(),
   >> which means that you absolutely cannot rely on softirq semantics.
   >>
   >> In particular, in CONFIG_PREEMPT=y kernels, rcu_preempt_check_callbacks()
   >> will notice that there is no rcu_read_lock() in effect and report
   >> a quiescent state for that CPU.  Because rcu_preempt_check_callbacks()
   >> is invoked from the scheduling-clock interrupt, it absolutely can
   >> execute during do_softirq(), and therefore being in softirq context
   >> in no way provides rcu_read_lock()-style protection.
   >>
   >> Now, Alexei's question was for CONFIG_PREEMPT=n kernels.  However, in
   >> that case, rcu_read_lock() and rcu_read_unlock() generate no code
   >> in recent production kernels, so there is no performance penalty for
   >> using them.  (In older kernels, they implied a barrier().)
   >>
   >> So either way, with or without CONFIG_PREEMPT, you should use
   >> rcu_read_lock() to get RCU protection.
   >>
   >> One alternative might be to switch to rcu_read_lock_bh(), but that
   >> will add local_disable_bh() overhead to your read paths.
   >>
   >> Does that help, or am I missing the point of the question?
   >
   > thanks a lot for explanation.

   Glad you liked it!

   > I mistakenly assumed that _bh variants are 'stronger' and
   > act as inclusive, but sounds like they're completely orthogonal
   > especially with preempt_rcu=y.

   Yes, they are pretty much orthogonal.

   > With preempt_rcu=n and preempt=y, it would be the case, since
   > bh disables preemption and rcu_read_lock does the same as well,
   > right? Of course, the code shouldn't be relying on that, so we
   > have to fix our stuff.

   Indeed, especially given that the kernel currently won't allow you
   to configure CONFIG_PREEMPT_RCU=n and CONFIG_PREEMPT=y.  If it does,
   please let me know, as that would be a bug that needs to be fixed.
   (For one thing, I do not test that combination.)

							Thanx, Paul
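
As a rough illustration of the 2016 explanation quoted above, the following userspace sketch models only what that text says about CONFIG_PREEMPT=y kernels: the scheduling-clock check reports a quiescent state whenever no rcu_read_lock() is in effect, regardless of softirq context. `struct cpu_model` and `tick_reports_quiescent_state()` are hypothetical names, not kernel APIs, and the sketch makes no claim about how current kernels behave.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state, as described in the quoted 2016 mail. */
struct cpu_model {
	int rcu_read_lock_nesting; /* depth of rcu_read_lock() sections */
	bool in_softirq;           /* true while running under do_softirq() */
};

/*
 * Models the behaviour the quoted mail attributes to
 * rcu_preempt_check_callbacks(): only the rcu_read_lock() nesting
 * is consulted, so being in softirq context does not hold off the
 * quiescent-state report.
 */
static bool tick_reports_quiescent_state(const struct cpu_model *cpu)
{
	return cpu->rcu_read_lock_nesting == 0;
}
```

With `in_softirq` set and a nesting depth of zero the model still reports a quiescent state, which is the reason given back then for keeping rcu_read_lock() in the driver rx path.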

And now, fast-forward again to 2021 ... :)

Thanks,
Daniel