Re: [PATCH] tracing/osnoise: Fix possible recursive locking for cpus_read_lock()
From: Ran Xiaokai <hidden>
Date: 2025-02-26 03:43:13
Also in:
lkml
On Tue, 25 Feb 2025 12:31:32 +0000 Ran Xiaokai [off-list ref] wrote:

@@ -2097,7 +2096,7 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
 		return;

 	guard(mutex)(&interface_lock);
-	guard(cpus_read_lock)();
+	cpus_read_lock();

 	if (!cpu_online(cpu))
 		return;

This is buggy. You removed the guard, and right below we have an error exit
that will leave this function without unlocking the cpus_read_lock().

Indeed. I will run the LTP cpu-hotplug testcases before the next version.

@@ -2105,7 +2104,12 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
 	if (!cpumask_test_cpu(cpu, &osnoise_cpumask))
 		return;

-	start_kthread(cpu);
+	if (start_kthread(cpu)) {
+		cpus_read_unlock();
+		stop_per_cpu_kthreads();
+		return;

If all you want to do is to unlock before calling stop_per_cpu_kthreads(),
then this should simply be:

	if (start_kthread(cpu)) {
		cpus_read_unlock();
		stop_per_cpu_kthreads();
		cpus_read_lock(); // The guard() above will unlock this
		return;
	}

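For readers unfamiliar with scope-based guards, the two review comments above can be illustrated with a small userspace sketch. It is not kernel code: the lock is only a depth counter, and DEMO_GUARD(), demo_read_lock(), demo_stop() and the other demo_* names are hypothetical stand-ins built on the compiler's cleanup attribute (GCC or Clang assumed). It shows that a manual lock with an early return leaves the lock held, while a guard stays balanced as long as an explicit unlock is paired with a re-lock before returning.

```c
#include <stdbool.h>

/* Userspace stand-in for the hotplug read lock: only a depth counter. */
static int demo_depth;
static int demo_depth_seen_by_stop = -1;

static void demo_read_lock(void)   { demo_depth++; }
static void demo_read_unlock(void) { demo_depth--; }

/* Called automatically when the guard variable leaves its scope. */
static void demo_guard_exit(int *unused)
{
	(void)unused;
	demo_read_unlock();
}

/* Hypothetical guard: lock now, unlock at scope exit. */
#define DEMO_GUARD() \
	int demo_guard_var __attribute__((cleanup(demo_guard_exit), unused)) = \
		(demo_read_lock(), 0)

/* Records the lock depth at the point where the stop path runs. */
static void demo_stop(void)
{
	demo_depth_seen_by_stop = demo_depth;
}

/* Manual locking: the early return skips the unlock. */
static void demo_workfn_manual(bool cpu_is_online)
{
	demo_read_lock();
	if (!cpu_is_online)
		return;			/* lock is still held here */
	demo_read_unlock();
}

/* Guarded locking with the unlock/re-lock pattern from the review. */
static void demo_workfn_guarded(bool start_fails)
{
	DEMO_GUARD();

	if (!start_fails)
		return;			/* guard unlocks here */

	demo_read_unlock();
	demo_stop();			/* runs with the lock dropped */
	demo_read_lock();		/* the guard will unlock this */
}

/* Helpers that report the depth left behind by each variant. */
static int demo_leak_after_manual(bool cpu_is_online)
{
	demo_depth = 0;
	demo_workfn_manual(cpu_is_online);
	return demo_depth;
}

static int demo_leak_after_guarded(bool start_fails)
{
	demo_depth = 0;
	demo_depth_seen_by_stop = -1;
	demo_workfn_guarded(start_fails);
	return demo_depth;
}
```

In this model, demo_leak_after_manual(false) reports a leftover depth of 1, whereas demo_leak_after_guarded() reports 0 for both the success and the failure path.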
This is the deadlock scenario:

start_per_cpu_kthreads()
  cpus_read_lock();                  // first lock call
  start_kthread(cpu)
    ... kthread_run_on_cpu() fails:
    if (IS_ERR(kthread)) {
      stop_per_cpu_kthreads(); {
        cpus_read_lock();            // second lock call, which causes the AA deadlock
      }
    }
  stop_per_cpu_kthreads();

Besides, stop_per_cpu_kthreads() is called in both start_per_cpu_kthreads() and
start_kthread(), which is unnecessary.
So the fix should be inside start_kthread()?
How about this?

--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2029,7 +2029,9 @@ static int start_kthread(unsigned int cpu)

 	if (IS_ERR(kthread)) {
 		pr_err(BANNER "could not start sampling thread\n");
+		cpus_read_unlock();
 		stop_per_cpu_kthreads();
+		cpus_read_lock();
 		return -ENOMEM;
 	}

@@ -2076,7 +2078,6 @@ static int start_per_cpu_kthreads(void)
 		retval = start_kthread(cpu);
 		if (retval) {
 			cpus_read_unlock();
-			stop_per_cpu_kthreads();
 			return retval;
 		}
 	}

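As a rough way to check the lock balance of this proposal, the call path can be modeled in userspace. This is only a sketch under simplifying assumptions: the read lock is a nesting counter that never blocks, all model_* names are hypothetical, and the `patched` flag selects between the current flow and the flow proposed above. A maximum depth of 2 corresponds to the nested cpus_read_lock() call in the scenario described earlier.

```c
#include <errno.h>
#include <stdbool.h>

static int model_depth;		/* current nesting of the modeled read lock */
static int model_max_depth;	/* deepest nesting observed in one run */

static void model_read_lock(void)
{
	if (++model_depth > model_max_depth)
		model_max_depth = model_depth;
}

static void model_read_unlock(void)
{
	model_depth--;
}

/* Takes the read lock itself, as in the scenario described above. */
static void model_stop_per_cpu_kthreads(void)
{
	model_read_lock();
	/* ... threads would be stopped here ... */
	model_read_unlock();
}

/* `fail` simulates kthread_run_on_cpu() returning an error. */
static int model_start_kthread(bool fail, bool patched)
{
	if (!fail)
		return 0;

	if (patched)
		model_read_unlock();	/* drop the caller's lock first */
	model_stop_per_cpu_kthreads();
	if (patched)
		model_read_lock();	/* restore it for the caller */
	return -ENOMEM;
}

static int model_start_per_cpu_kthreads(bool fail, bool patched)
{
	int retval;

	model_depth = 0;
	model_max_depth = 0;

	model_read_lock();		/* first lock call */
	retval = model_start_kthread(fail, patched);
	if (retval) {
		model_read_unlock();
		if (!patched)
			model_stop_per_cpu_kthreads();	/* redundant second call */
		return retval;
	}
	model_read_unlock();
	return 0;
}
```

With fail set, the unpatched flow reaches model_max_depth == 2, while the patched flow stays at 1. Both leave model_depth at 0 on return. The model says nothing about blocking or about writers, so it cannot confirm whether this is the issue actually being hit.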
But I still have to verify that this is indeed the issue here.

-- Steve

+	}
+	cpus_read_unlock();
 }

 static DECLARE_WORK(osnoise_hotplug_work, osnoise_hotplug_workfn);