Re: [PATCH 5/5] tracing/hwlat: Fix deadlock in cpuhp processing
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2024-11-12 23:50:42
On Wed, 9 Oct 2024 15:47:23 +0800 "liwei (GF)" [off-list ref] wrote:
On 2024/10/4 4:19, Steven Rostedt wrote:quoted
On Tue, 24 Sep 2024 17:45:15 +0800 Wei Li [off-list ref] wrote:quoted
Another "hung task" error was reported during the test, and i figured out the deadlock scenario is as follows: T1 [BP] | T2 [AP] | T3 [hwlatd/1] | T4 work_for_cpu_fn() | cpuhp_thread_fun() | kthread_fn() | hwlat_hotplug_workfn() _cpu_down() | stop_cpu_kthread() | | mutex_lock(&hwlat_data.lock) cpus_write_lock() | kthread_stop(hwlatd/1) | mutex_lock(&hwlat_data.lock) | __cpuhp_kick_ap() | wait_for_completion() | | cpus_read_lock()
So, if we can make T3 not take the mutex_lock then that should be a solution, right?
quoted
quoted
It constitutes ABBA deadlock indirectly beAs it calls msleep_interruptible() and 'break' if signal pending below, i choosed 'break' here too.tween "cpu_hotplug_lock" and "hwlat_data.lock", make the mutex obtaining in kthread_fn() interruptible to fix this. Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations") Signed-off-by: Wei Li <redacted> --- kernel/trace/trace_hwlat.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c index 3bd6071441ad..4c228ccb8a38 100644 --- a/kernel/trace/trace_hwlat.c +++ b/kernel/trace/trace_hwlat.c@@ -370,7 +370,8 @@ static int kthread_fn(void *data) get_sample(); local_irq_enable(); - mutex_lock(&hwlat_data.lock); + if (mutex_lock_interruptible(&hwlat_data.lock)) + break;So basically this requires as signal to break it out of the loop? But if it receives a signal for any other reason, it breaks out of the loop too. Which is not what we want. If anything, it should be: if (mutex_lock_interruptible(&hwlat_data.lock)) continue;As it calls msleep_interruptible() below and 'break' if signal pending, i choosed 'break' here too.quoted
But I still don't really like this solution, as it will still report a deadlock. Is it possible to switch the cpu_read_lock() to be taken before the hwlat_data.lock?It's a little hard to change the sequence of these two locks, we'll hold "cpu_hotplug_lock" for longer unnecessarily if we do that. But maybe we can remove the "hwlat_data.lock" in kthread_fn(), let me try another modification.
Have you found something yet? Looking at the code we have:
mutex_lock(&hwlat_data.lock);
interval = hwlat_data.sample_window - hwlat_data.sample_width;
mutex_unlock(&hwlat_data.lock);
Where the lock is only there to synchronize the calculation of the
interval. We could add a counter for when sample_window and sample_width
are updated, and we could simply do:
again:
counter = atomic_read(&hwlat_data.counter);
smp_rmb();
if (!(counter & 1)) {
new_interval = hwlat_data.sample_window - hwlat_data.sample_width;
smp_rmb();
if (counter == atomic_read(&hwlat_data.counter))
interval = new_interval;
}
Then we could do something like:
atomic_inc(&hwlat_data.counter);
smp_wmb();
/* update sample_window or sample_width */
smp_wmb();
atomic_inc(&hwlat_data.counter);
And then the interval will only be updated if the values are not being
updated. Otherwise it just keeps the previous value.
-- Steve