Thread (4 messages), 3 authors, 2023-03-09

Re: [PATCH] trace/hwlat: Do not restart per-cpu threads if they are already running

From: Daniel Bristot de Oliveira <bristot@kernel.org>
Date: 2023-03-02 11:50:05
Also in: lkml

Hi Tero,

On 3/2/23 08:36, Tero Kristo wrote:
> Check if the hwlatd thread for the cpu is already running, before
> starting a new one. This avoids running multiple instances of the same
> CPU thread on the system. Also, do not wipe the contents of the
> per-cpu kthread data when starting the tracer, as this can completely
> forget about already running instances and start new additional per-cpu
> threads. Fixes issues where fiddling with either the mode of the hwlat
> tracer or doing cpu-hotplugs messes up the internal book-keeping
> resulting in stale hwlatd threads.

Thanks for your patch.

Would you mind explaining how you hit the problem? That is, how can
I reproduce the problem you faced?

I tried reproducing it by dispatching the hwlat tracer in two instances,
but the system already blocks me...

[root@vm tracing]# echo hwlat > current_tracer 
[root@vm tracing]# cd instances/
[root@vm instances]# mkdir hwlat_2
[root@vm instances]# cd hwlat_2/
[root@vm hwlat_2]# echo hwlat > current_tracer 
-bash: echo: write error: Device or resource busy

[root@vm hwlat_2]# cd ../../
[root@vm tracing]# echo nop > current_tracer 
[root@vm tracing]# cd instances/hwlat_2/
[root@vm hwlat_2]# echo hwlat > current_tracer 
[root@vm hwlat_2]# cd ..
[root@vm instances]# mkdir hwlat_1
[root@vm instances]# cd hwlat_1/
[root@vm hwlat_1]# echo hwlat > current_tracer 
-bash: echo: write error: Device or resource busy
[root@vm hwlat_1]# 

Having a reproducer helps us to think better about the problem.
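
One way to confirm the symptom from the changelog (more than one hwlatd
thread for the same CPU) might be to look for duplicate thread names.
This is only a minimal sketch: it assumes the threads keep the
"hwlatd/%u" name used in the patch and that `ps -eo comm` is available.
The helper name find_dup_hwlatd is made up for illustration.

```shell
# Hypothetical helper: reads one command name per line on stdin and
# prints each hwlatd/<cpu> name once if it occurs more than once.
find_dup_hwlatd() {
    awk '$1 ~ /^hwlatd\// { if (++seen[$1] == 2) print $1 }'
}

# Possible use on a live system (assumption: procps-style ps):
#   ps -eo comm | find_dup_hwlatd
```

Empty output would mean no duplicated per-cpu thread names were found;
it does not by itself show that the book-keeping is consistent.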

-- Daniel
Signed-off-by: Tero Kristo <redacted>
---
 kernel/trace/trace_hwlat.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
index d440ddd5fd8b..c4945f8adc11 100644
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -492,6 +492,10 @@ static int start_cpu_kthread(unsigned int cpu)
 {
 	struct task_struct *kthread;
 
+	/* Do not start a new hwlatd thread if it is already running */
+	if (per_cpu(hwlat_per_cpu_data, cpu).kthread)
+		return 0;
+
 	kthread = kthread_run_on_cpu(kthread_fn, NULL, cpu, "hwlatd/%u");
 	if (IS_ERR(kthread)) {
 		pr_err(BANNER "could not start sampling thread\n");
@@ -584,9 +588,6 @@ static int start_per_cpu_kthreads(struct trace_array *tr)
 	 */
 	cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
 
-	for_each_online_cpu(cpu)
-		per_cpu(hwlat_per_cpu_data, cpu).kthread = NULL;
-
 	for_each_cpu(cpu, current_mask) {
 		retval = start_cpu_kthread(cpu);
 		if (retval)
  