Thread (7 messages) 7 messages, 3 authors, 2025-03-17

Re: [PATCH] tracing/osnoise: Fix possible recursive locking for cpus_read_lock()

From: Vishal Chourasia <hidden>
Date: 2025-02-27 07:40:23
Also in: lkml

Hi,

On Tue, Feb 25, 2025 at 12:31:32PM +0000, Ran Xiaokai wrote:
quoted hunk ↗ jump to hunk
From: Ran Xiaokai <redacted>

Lockdep reports this deadlock log:
============================================
WARNING: possible recursive locking detected
--------------------------------------------
sh/31444 is trying to acquire lock:
ffffffff82c51af0 (cpu_hotplug_lock){++++}-{0:0}, at:
stop_per_cpu_kthreads+0x7/0x60

but task is already holding lock:
ffffffff82c51af0 (cpu_hotplug_lock){++++}-{0:0}, at:
start_per_cpu_kthreads+0x28/0x140

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(cpu_hotplug_lock);
  lock(cpu_hotplug_lock);

Call Trace:
 <TASK>
 __lock_acquire+0x1612/0x29b0
 lock_acquire+0xd0/0x2e0
 cpus_read_lock+0x49/0x120
 stop_per_cpu_kthreads+0x7/0x60
 start_kthread+0x105/0x120
 start_per_cpu_kthreads+0xdd/0x140
 osnoise_workload_start+0x261/0x2f0
 osnoise_tracer_start+0x18/0x4

In start_kthread(), when kthread_run_on_cpu() fails,
cpus_read_unlock() should be called before stop_per_cpu_kthreads(),
but both start_per_cpu_kthreads() and start_kthread() call the error
handling routine stop_per_cpu_kthreads(),
which is redundant. Only one call is necessary.
To fix this, move stop_per_cpu_kthreads() outside of start_kthread(),
use the return value of start_kthread() to determine kthread creation
error.
The same issue exists in osnoise_hotplug_workfn() too.

Reviewed-by: Yang Guang <redacted>
Reviewed-by: Wang Yong <redacted>
Signed-off-by: Ran Xiaokai <redacted>
---
 kernel/trace/trace_osnoise.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 92e16f03fa4e..38fb0c655f5b 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2029,7 +2029,6 @@ static int start_kthread(unsigned int cpu)
 
 	if (IS_ERR(kthread)) {
 		pr_err(BANNER "could not start sampling thread\n");
-		stop_per_cpu_kthreads();
 		return -ENOMEM;
 	}
 
@@ -2097,7 +2096,7 @@ static void osnoise_hotplug_workfn(struct
work_struct *dummy)
 		return;
 
 	guard(mutex)(&interface_lock);
-	guard(cpus_read_lock)();
+	cpus_read_lock();
 
 	if (!cpu_online(cpu))
 		return;
@@ -2105,7 +2104,12 @@ static void osnoise_hotplug_workfn(struct
work_struct *dummy)
 	if (!cpumask_test_cpu(cpu, &osnoise_cpumask))
 		return;
 
-	start_kthread(cpu);
+	if (start_kthread(cpu)) {
+		cpus_read_unlock();
+		stop_per_cpu_kthreads();
Is it right to call stop_per_cpu_kthreads() which stops osnoise kthread
for every other CPUs in the system if a failure occurs during hotplug of a
CPU?

On another note, since stop_per_cpu_kthreads() invokes stop_kthread()
for every online CPU. It's better to remove stop_per_cpu_kthreads() from
start_kthread() and handle the error in `osnoise_hotplug_workfn` 

	Vishal
+		return;
+	}
+	cpus_read_unlock();
 }
 
 static DECLARE_WORK(osnoise_hotplug_work, osnoise_hotplug_workfn);
-- 
2.15.2
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help