Re: [PATCH] rcu: Add missing unlock in rcu_print_task_stall
From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2021-07-27 15:52:19
Also in:
linux-mediatek, lkml, rcu
On Tue, Jul 27, 2021 at 03:45:42PM +0800, Cheng Jui Wang wrote:
We encountered a deadlock with the following lockdep warning.
rcu_print_task_stall() is supposed to release rnp->lock, but it may
return without unlocking on this path:

	if (!rcu_preempt_blocked_readers_cgp(rnp))
		return 0;

Add the missing unlock before the return to fix it.
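
To make the failure mode concrete, here is a minimal userspace sketch of the
same pattern. It is not kernel code: toy_node, toy_lock(), toy_unlock(),
print_stall_buggy(), print_stall_fixed() and lock_leaked() are all
hypothetical names made up for illustration, and a plain flag stands in for
the rcu_node ->lock. The only point is that a callee which is expected to
drop the caller's lock must do so on every return path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an rcu_node and its ->lock (assumption, not kernel code). */
struct toy_node {
	bool locked;          /* true while the toy lock is held */
	int blocked_readers;  /* stand-in for rcu_preempt_blocked_readers_cgp() */
};

static void toy_lock(struct toy_node *np)
{
	assert(!np->locked);  /* a second acquisition would self-deadlock */
	np->locked = true;
}

static void toy_unlock(struct toy_node *np)
{
	assert(np->locked);
	np->locked = false;
}

/* Buggy shape: the early return leaves the caller's lock held. */
static int print_stall_buggy(struct toy_node *np)
{
	if (!np->blocked_readers)
		return 0;
	toy_unlock(np);
	return np->blocked_readers;
}

/* Fixed shape: every return path releases the lock. */
static int print_stall_fixed(struct toy_node *np)
{
	if (!np->blocked_readers) {
		toy_unlock(np);
		return 0;
	}
	toy_unlock(np);
	return np->blocked_readers;
}

/* Returns true if the lock is still held after the callee returns. */
static bool lock_leaked(int (*print_stall)(struct toy_node *), int blocked)
{
	struct toy_node np = { .locked = false, .blocked_readers = blocked };

	toy_lock(&np);
	print_stall(&np);
	return np.locked;
}
```

With no blocked readers, lock_leaked(print_stall_buggy, 0) reports the lock
as still held, so the next attempt to take it would hang, which is the shape
of the recursive acquisition that lockdep flags in the report.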
============================================
WARNING: possible recursive locking detected
5.10.43
--------------------------------------------
swapper/7/0 is trying to acquire lock:
ffffffc01268c018 (rcu_node_0){-.-.}-{2:2}, at: rcu_dump_cpu_stacks+0x94/0x138
but task is already holding lock:
ffffffc01268c018 (rcu_node_0){-.-.}-{2:2}, at: check_cpu_stall+0x34c/0x6f8
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(rcu_node_0);
lock(rcu_node_0);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by swapper/7/0:
#0: ffffffc01268c018 (rcu_node_0){-.-.}-{2:2}, at: check_cpu_stall+0x34c/0x6f8
stack backtrace:
CPU: 7 PID: 0 Comm: swapper/7
Call trace:
dump_backtrace.cfi_jt+0x0/0x8
show_stack+0x1c/0x2c
dump_stack_lvl+0xd8/0x16c
validate_chain+0x2124/0x2d34
__lock_acquire+0x7e4/0xed4
lock_acquire+0x114/0x394
_raw_spin_lock_irqsave+0x88/0xd4
rcu_dump_cpu_stacks+0x94/0x138
check_cpu_stall+0x498/0x6f8
rcu_sched_clock_irq+0xd4/0x214
update_process_times+0xb4/0xf4
tick_sched_timer+0x98/0x110
__hrtimer_run_queues+0x19c/0x2bc
hrtimer_interrupt+0x10c/0x3a8
arch_timer_handler_phys+0x5c/0x98
handle_percpu_devid_irq+0xe0/0x2a8
__handle_domain_irq+0xd0/0x19c
gic_handle_irq+0x6c/0x134
el1_irq+0xe0/0x1c0
arch_cpu_idle+0x1c/0x30
default_idle_call+0x58/0xcc
do_idle.llvm.13807299673429836468+0x118/0x2e8
cpu_startup_entry+0x28/0x2c
secondary_start_kernel+0x1d0/0x23c
Signed-off-by: Cheng Jui Wang <redacted>
Good catch, thank you!
However, Yanfei Xu beat you to this with commit f6b3995a8b56dc ("rcu:
Fix stall-warning deadlock due to non-release of rcu_node ->lock"),
which is in -rcu and slated for the upcoming merge window.
His commit 8baded711edc ("rcu: Fix to include first blocked task in
stall warning") might also be of interest to you.
Thanx, Paul
---
 kernel/rcu/tree_stall.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 6c76988cc019..3dc464d4d9a5 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -267,8 +267,10 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
 	struct task_struct *ts[8];
 
 	lockdep_assert_irqs_disabled();
-	if (!rcu_preempt_blocked_readers_cgp(rnp))
+	if (!rcu_preempt_blocked_readers_cgp(rnp)) {
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		return 0;
+	}
 	pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
 	       rnp->level, rnp->grplo, rnp->grphi);
 	t = list_entry(rnp->gp_tasks->prev,
-- 
2.18.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel