Thread (21 messages), 4 authors, 2025-02-06

Re: [PATCH rcu v2] 4/5] rcu-tasks: Move RCU Tasks self-tests to core_initcall()

From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2025-02-05 14:50:53
Also in: lkml, rcu

On Tue, Feb 04, 2025 at 12:20:30PM -0800, Paul E. McKenney wrote:
On Tue, Feb 04, 2025 at 05:34:09PM +0100, Sebastian Andrzej Siewior wrote:
On 2025-02-04 03:51:48 [-0800], Paul E. McKenney wrote:
On Tue, Feb 04, 2025 at 11:26:11AM +0100, Sebastian Andrzej Siewior wrote:
On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
The timer and hrtimer softirq processing has moved to dedicated threads
for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
timers not expiring until later in early boot, which in turn causes the
RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
which further causes the entire kernel to hang.  One fix would be to
make timers work during this time, but there are no known users of RCU
Tasks grace periods during that time, so no justification for the added
complexity.  Not yet, anyway.

This commit therefore moves the call to rcu_init_tasks_generic() from
kernel_init_freeable() to a core_initcall().  This works because the
timer and hrtimer kthreads are created at early_initcall() time.
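
The ordering argument in the commit log above can be illustrated with a
toy user-space model.  This is only a sketch, not kernel code: BootModel,
spawn_timer_kthreads(), rcu_tasks_self_test() and boot() are hypothetical
stand-ins, and it assumes that the old call site in kernel_init_freeable()
ran before any early_initcall() had been invoked.

```python
# Toy model of initcall ordering (illustrative only, not kernel code).
# The level names mirror the kernel's initcall levels; everything else
# here is a hypothetical stand-in.

INITCALL_LEVELS = ["early", "core", "postcore", "arch",
                   "subsys", "fs", "device", "late"]


class BootModel:
    def __init__(self):
        self.timer_kthreads_ready = False
        self.log = []
        self.initcalls = {level: [] for level in INITCALL_LEVELS}

    def register(self, level, fn):
        """Queue fn to run when the given initcall level is reached."""
        self.initcalls[level].append(fn)

    def spawn_timer_kthreads(self):
        # Stand-in for creation of the timer and hrtimer kthreads.
        self.timer_kthreads_ready = True
        self.log.append("timer kthreads created")

    def rcu_tasks_self_test(self):
        # Stand-in for the self-tests, which depend on timers expiring.
        # Without the kthreads, the real kernel would hang at this point.
        if not self.timer_kthreads_ready:
            self.log.append("self-test would hang")
            return False
        self.log.append("self-test passed")
        return True

    def run_initcalls(self):
        # Run every registered initcall in level order.
        for level in INITCALL_LEVELS:
            for fn in self.initcalls[level]:
                fn()


def boot(self_test_as_core_initcall):
    """Return the boot log for the old or the new placement of the self-test."""
    model = BootModel()
    model.register("early", model.spawn_timer_kthreads)
    if self_test_as_core_initcall:
        # New placement: runs after all early initcalls.
        model.register("core", model.rcu_tasks_self_test)
    else:
        # Assumed old placement: invoked before any initcall runs.
        model.rcu_tasks_self_test()
    model.run_initcalls()
    return model.log


print(boot(self_test_as_core_initcall=False))
print(boot(self_test_as_core_initcall=True))
```

In this model, only the core_initcall() placement sees the timer kthreads
already created by the time the self-test runs.
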
Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
?
Quite possibly...  I freely confess that I was more focused on the fix
than on the bug's origin.  Would you be willing to try this commit and
its predecessor?
Yes. Just verified.
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Boqun, could you please apply Sebastian's tags, including the Fixes
tag above?
I played with it and I can reproduce the issue with !RT + threadirqs but
not with RT (which implies threadirqs).
Is there anything in RT that avoids the problem?
Not that I know of, but then again I did not try it.  To your point,
The change looks fine.
I do need to make a -rt rcutorture scenario.  TREE03 has been intended to
approximate this, and it uses the following Kconfig options:

------------------------------------------------------------------------

CONFIG_SMP=y
CONFIG_NR_CPUS=16
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y
#CHECK#CONFIG_PREEMPT_RCU=y
CONFIG_HZ_PERIODIC=y
CONFIG_NO_HZ_IDLE=n
CONFIG_NO_HZ_FULL=n
CONFIG_RCU_TRACE=y
CONFIG_HOTPLUG_CPU=y
CONFIG_RCU_FANOUT=2
CONFIG_RCU_FANOUT_LEAF=2
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_RCU_BOOST=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y
You could enable CONFIG_PREEMPT_RT ;)
CONFIG_PREEMPT_LAZY is probably also set a lot.

That should be it.
------------------------------------------------------------------------

And the following kernel-boot parameters:

------------------------------------------------------------------------

rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
rcutree.gp_preinit_delay=12
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
rcutree.kthread_prio=2
threadirqs
rcutree.use_softirq=0
rcutorture.preempt_duration=10

------------------------------------------------------------------------

Some of these are for RCU's benefit, but what should I change to more
closely approximate a typical real-time deployment?
See above.
Which got me this diff:

------------------------------------------------------------------------
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
index 2dc31b16e506..6158f5002497 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
@@ -2,7 +2,9 @@ CONFIG_SMP=y
 CONFIG_NR_CPUS=16
 CONFIG_PREEMPT_NONE=n
 CONFIG_PREEMPT_VOLUNTARY=n
-CONFIG_PREEMPT=y
+CONFIG_PREEMPT=n
+CONFIG_PREEMPT_LAZY=y
+CONFIG_PREEMPT_RT=y
 #CHECK#CONFIG_PREEMPT_RCU=y
 CONFIG_HZ_PERIODIC=y
 CONFIG_NO_HZ_IDLE=n
@@ -15,4 +17,5 @@ CONFIG_RCU_NOCB_CPU=n
 CONFIG_DEBUG_LOCK_ALLOC=n
 CONFIG_RCU_BOOST=y
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_EXPERT=y
 CONFIG_RCU_EXPERT=y

------------------------------------------------------------------------
But a 10-minute run got me the splat shown below, and in addition a
shutdown-time hang.

This is caused by RCU falling behind a callback-flooding kthread that
invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
avoids the splat, but still gets the shutdown-time hang.  Retrying with
the default rcutree.kthread_prio=2 failed to reproduce the splat, but
it did reproduce the shutdown-time hang.

OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
at the end of rcu_torture_cleanup() got all of rcutorture's output
flushed, but lost the subsequent shutdown-time console traffic.  The
pr_flush(HZ/10,1) seems more sensible, but this is private to printk().

I would like to log the shutdown-time console traffic because RCU can
sometimes break things on that path.

Thoughts?
Longer rcutorture runs showed (not unexpectedly) that the 100-millisecond
sleep was not always sufficient, nor was a 500-millisecond sleep.

There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
that appears to be intended to dump out the printk() buffers, but it
does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
Does there need to be a pr_flush() call prior to the call to
migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
or kernel_shutdown_prepare()?

Adding John Ogness on CC so that he can tell me the error of my ways.
PS:  I will do longer runs in case that splat was not a one-off.
     My concern is that I might need to adjust something more in order
     to get a reliable callback-flooding test.
And this was not a one-off.  Running 10 40-minute instances of the new-age
CONFIG_PREEMPT_RT=y TREE03 reliably triggers this.  At first glance,
this appears to be an interaction between testing of RCU priority
boosting and RCU-callback flooding forward-progress testing.  And disabling
testing of RCU priority boosting avoids these OOMs.  As does running
without CONFIG_PREEMPT_RT=y.

My next step is to run with rcutorture.preempt_duration=0, which disables
within-guest-OS random preempting of kthreads.  If that doesn't help,
I expect to play around with avoiding concurrent testing of RCU priority
boosting and RCU callback flooding forward progress.
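
The experiments described above could be scripted roughly as follows.
This is a sketch under stated assumptions: it assumes the runs are driven
by the in-tree kvm.sh script with its --configs, --duration (in minutes)
and --bootargs options, and it assumes that rcutorture.test_boost=0 is the
knob used to disable testing of RCU priority boosting, which the text
above does not spell out.  The kvm_command() helper and the experiment
names are hypothetical.

```python
import shlex

# Path of the rcutorture driver script within the kernel source tree.
KVM_SH = "tools/testing/selftests/rcutorture/bin/kvm.sh"


def kvm_command(instances, scenario, minutes, bootargs=()):
    """Return the argv list for one batch of rcutorture runs."""
    cmd = [KVM_SH,
           "--configs", f"{instances}*{scenario}",
           "--duration", str(minutes)]
    if bootargs:
        # All extra kernel boot parameters go into a single argument.
        cmd += ["--bootargs", " ".join(bootargs)]
    return cmd


experiments = {
    # Ten 40-minute TREE03 instances with the scenario's own boot parameters.
    "baseline": [],
    # Disable within-guest-OS random preemption of kthreads.
    "no-preempt": ["rcutorture.preempt_duration=0"],
    # Assumption: this parameter disables RCU priority-boosting tests.
    "no-boost": ["rcutorture.test_boost=0"],
}

for name, args in experiments.items():
    print(name + ":", shlex.join(kvm_command(10, "TREE03", 40, args)))
```

Printing the commands rather than running them keeps the sketch free of
side effects; each printed line can be pasted into a shell at the top of
a kernel tree.
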

Or is there a better way?

							Thanx, Paul