Re: [PATCH 6/7] smp: Don't yell about IRQs disabled in kgdb_roundup_cpus()
From: Doug Anderson <dianders@chromium.org>
Date: 2018-10-30 22:21:48
Also in:
linux-arm-kernel, linux-arm-msm, linux-mips, linux-sh, lkml, sparclinux
Daniel, On Tue, Oct 30, 2018 at 2:42 AM Daniel Thompson [off-list ref] wrote:
On Mon, Oct 29, 2018 at 11:07:06AM -0700, Douglas Anderson wrote:quoted
In kgdb_roundup_cpus() we've got code that looks like: local_irq_enable(); smp_call_function(kgdb_call_nmi_hook, NULL, 0); local_irq_disable(); In certain cases when we drop into kgdb (like with sysrq-g on a serial console) we'll get a big yell that looks like: sysrq: SysRq : DEBUG ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 pstate: 604003c9 (nZCv DAIF +PAN -UAO) pc : lockdep_hardirqs_on+0xf0/0x160 ... Call trace: lockdep_hardirqs_on+0xf0/0x160 trace_hardirqs_on+0x188/0x1ac kgdb_roundup_cpus+0x14/0x3c kgdb_cpu_enter+0x53c/0x5cc kgdb_handle_exception+0x180/0x1d4 kgdb_compiled_brk_fn+0x30/0x3c brk_handler+0x134/0x178 do_debug_exception+0xfc/0x178 el1_dbg+0x18/0x78 kgdb_breakpoint+0x34/0x58 sysrq_handle_dbg+0x54/0x5c __handle_sysrq+0x114/0x21c handle_sysrq+0x30/0x3c qcom_geni_serial_isr+0x2dc/0x30c ... ... irq event stamp: ...45 hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 ---[ end trace adf21f830c46e638 ]--- Let's add kgdb to the list of reasons not to warn in smp_call_function_many(). That will allow us (in a future patch) to stop calling local_irq_enable() which will get rid of the original splat. NOTE: with this change comes the obvious question: will we start deadlocking more often now when we drop into the debugger. I can't say that for sure one way or the other, but the fact that we do the same logic for "oops_in_progress" makes me feel like it shouldn't happen too often. Also note that the old logic of turning on interrupts temporarily wasn't exactly safe since (I presume) that could have violated spin_lock_irqsave() semantics and ended up with a deadlock of its own.This is part of the code to bring all the cores to a halt and since the other cores are still running kgdb isn't yet able to use the fact all the CPUs are halted to bend the rules. It is better for this code to play by the rules if it can. Is is possible to get the roundup functions to use a private csd alongside smp_call_function_single_async()? We could add a helper function to the debug core to avoid having to add cpu_online loops into every kgdb_roundup_cpus() implementaton.
Exactly the kind of helpful suggestion I was looking for. Thank you very much! See v2 and hopefully it matches what you were thinking of. -Doug