Thread (7 messages) 7 messages, 2 authors, 2021-09-23

Re: rcu/tree: Protect rcu_rdp_is_offloaded() invocations on RT

From: Frederic Weisbecker <frederic@kernel.org>
Date: 2021-09-21 23:45:23
Also in: linux-rt-users, lkml, rcu

Possibly related (same subject, not in this thread)

On Tue, Sep 21, 2021 at 11:12:50PM +0200, Thomas Gleixner wrote:
Valentin reported warnings about suspicious RCU usage on RT kernels. Those
happen when offloading of RCU callbacks is enabled:

  WARNING: suspicious RCU usage
  5.13.0-rt1 #20 Not tainted
  -----------------------------
  kernel/rcu/tree_plugin.h:69 Unsafe read of RCU_NOCB offloaded state!

  rcu_rdp_is_offloaded (kernel/rcu/tree_plugin.h:69 kernel/rcu/tree_plugin.h:58)
  rcu_core (kernel/rcu/tree.c:2332 kernel/rcu/tree.c:2398 kernel/rcu/tree.c:2777)
  rcu_cpu_kthread (./include/linux/bottom_half.h:32 kernel/rcu/tree.c:2876)

The reason is that rcu_rdp_is_offloaded() is invoked without one of the
required protections on RT enabled kernels because local_bh_disable() does
not disable preemption on RT.

Valentin proposed to add a local lock to the code in question, but that's
suboptimal in several aspects:

  1) local locks add extra code to !RT kernels for no value.

  2) All possible callsites have to audited and amended when affected
     possible at an outer function level due to lock nesting issues.

  3) As the local lock has to be taken at the outer functions it's required
     to release and reacquire them in the inner code sections which might
     voluntary schedule, e.g. rcu_do_batch().

Both callsites of rcu_rdp_is_offloaded() which trigger this check invoke
rcu_rdp_is_offloaded() in the variable declaration section right at the top
of the functions. But the actual usage of the result is either within a
section which provides the required protections or after such a section.

So the obvious solution is to move the invocation into the code sections
which provide the proper protections, which solves the problem for RT and
does not have any impact on !RT kernels.
Also while at it, I'm asking again: traditionally softirqs could assume that
manipulating a local state was safe against !irq_count() code fiddling with
the same state on the same CPU.

Now with preemptible softirqs, that assumption can be broken anytime. RCU was
fortunate enough to have a warning for that. But who knows how many issues like
this are lurking?

Thanks.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help