Thread (18 messages) 18 messages, 5 authors, 2023-07-06

Re: [PATCH linux-next][RFC]torture: avoid offline tick_do_timer_cpu

From: Zhouyi Zhou <hidden>
Date: 2022-11-27 02:45:52
Also in: lkml

Thank Thomas for your guidance

On Sun, Nov 27, 2022 at 1:05 AM Thomas Gleixner [off-list ref] wrote:
On Mon, Nov 21 2022 at 11:51, Zhouyi Zhou wrote:
quoted
During CPU-hotplug torture (CONFIG_NO_HZ_FULL=y), if we try to
offline tick_do_timer_cpu, the operation will fail because in
function tick_nohz_cpu_down:
if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
      return -EBUSY;
Above bug was first discovered in torture tests performed in PPC VM
How is this a bug?
Yes, this is a false positive instead.
quoted
of Open Source Lab of Oregon State University, and reproducable in RISC-V
and X86-64 (with additional kernel commandline cpu0_hotplug).

In this patch, we avoid offline tick_do_timer_cpu by distribute
the offlining cpu among remaining cpus.
Please read Documentation/process. Search for 'this patch'...
Documentation/process/submitting-patches.rst says:
"Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour."

So, I should construct my patch as:
We avoid ... by ...
quoted
Signed-off-by: Zhouyi Zhou <redacted>
---
 include/linux/tick.h        |  1 +
 kernel/time/tick-common.c   |  1 +
 kernel/time/tick-internal.h |  1 -
 kernel/torture.c            | 10 ++++++++++
 4 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index bfd571f18cfd..23cc0b205853 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -14,6 +14,7 @@
 #include <linux/rcupdate.h>

 #ifdef CONFIG_GENERIC_CLOCKEVENTS
+extern int tick_do_timer_cpu __read_mostly;
 extern void __init tick_init(void);
 /* Should be core only, but ARM BL switcher requires it */
 extern void tick_suspend_local(void);
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 46789356f856..87b9b9afa320 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -48,6 +48,7 @@ ktime_t tick_next_period;
  *    procedure also covers cpu hotplug.
  */
 int tick_do_timer_cpu __read_mostly = TICK_DO_TIMER_BOOT;
+EXPORT_SYMBOL_GPL(tick_do_timer_cpu);
No. We are not exporting this just to make a bogus test case happy.

Fix the torture code to handle -EBUSY correctly.
I am going to do a study on this, for now, I do a grep in the kernel tree:
find . -name "*.c"|xargs grep cpuhp_setup_state|wc -l
The result of the grep command shows that there are 268
cpuhp_setup_state* cases.
which may make our task more complicated.

After my study, should we also take Frederic's proposal as a possible option?
(construct a function for this)
https://lore.kernel.org/lkml/20221123223658.GC1395324@lothringen/ (local)

I learned a lot during this process
Many thanks
Zhouyi
Thanks,

        tglx
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help