Re: [PATCH 3/3] arm64: debug: Remove rcu_read_lock from debug exception
From: Masami Hiramatsu <mhiramat@kernel.org>
Date: 2019-07-21 01:58:49
Also in: lkml
On Sat, 20 Jul 2019 16:32:32 +0900 Masami Hiramatsu [off-list ref] wrote:

> Hi James,
>
> On Fri, 19 Jul 2019 09:42:05 +0100 James Morse [off-list ref] wrote:
>
> > Hi,
> >
> > On 7/18/19 3:31 PM, Masami Hiramatsu wrote:
> > > On Thu, 18 Jul 2019 10:20:23 +0100 Mark Rutland [off-list ref] wrote:
> > >
> > > > On Wed, Jul 17, 2019 at 11:22:15PM -0700, Paul E. McKenney wrote:
> > > > > On Thu, Jul 18, 2019 at 02:43:58PM +0900, Masami Hiramatsu wrote:
> > > > > > Remove rcu_read_lock()/rcu_read_unlock() from debug exception handlers
> > > > > > since the software breakpoint can be hit on idle task.
> > > >
> > > > Why precisely do we need to elide these? Are we seeing warnings today?
> > >
> > > Yes, unfortunately, or fortunately. Naresh reported that it warns when
> > > ftracetest runs. I confirmed that it also happens if I probe on
> > > default_idle_call.
> > >
> > > /sys/kernel/debug/tracing # echo p default_idle_call >> kprobe_events
> > > /sys/kernel/debug/tracing # echo 1 > events/kprobes/enable
> > > /sys/kernel/debug/tracing # [ 135.122237]
> > > [ 135.125035] =============================
> > > [ 135.125310] WARNING: suspicious RCU usage
> >
> > > [ 135.132224] Call trace:
> > > [ 135.132491] dump_backtrace+0x0/0x140
> > > [ 135.132806] show_stack+0x24/0x30
> > > [ 135.133133] dump_stack+0xc4/0x10c
> > > [ 135.133726] lockdep_rcu_suspicious+0xf8/0x108
> > > [ 135.134171] call_break_hook+0x170/0x178
> > > [ 135.134486] brk_handler+0x28/0x68
> > > [ 135.134792] do_debug_exception+0x90/0x150
> > > [ 135.135051] el1_dbg+0x18/0x8c
> > > [ 135.135260] default_idle_call+0x0/0x44
> > > [ 135.135516] cpu_startup_entry+0x2c/0x30
> > > [ 135.135815] rest_init+0x1b0/0x280
> > > [ 135.136044] arch_call_rest_init+0x14/0x1c
> > > [ 135.136305] start_kernel+0x4d4/0x500
> >
> > > > > The exception entry and exit use irq_enter() and irq_exit(), in this
> > > > > case, correct? Otherwise RCU will be ignoring this CPU.
> > > >
> > > > This is missing today, which sounds like the underlying bug.
> > >
> > > Agreed. I'm not so familiar with how debug exceptions are handled on arm64.
> > > Would it be a kind of NMI or IRQ?
> >
> > Debug exceptions can interrupt both SError (think: machine check) and
> > pseudo-NMI, which both in turn interrupt interrupt-masked code. So they
> > are a kind of NMI.
> >
> > But, be careful not to call 'nmi_enter()' twice, see do_serror() for how
> > we work around this...
>
> OK. I think we can use rcu_nmi_enter/exit() the same way x86 does.

Adding this solves the rcu_read_lock() warning issues too. So I will just replace [PATCH 3/3] with that.
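To make the reasoning concrete, here is a toy userspace model of why the warning fires when the breakpoint is hit in the idle task, and why bracketing the handler with an NMI-style enter/exit pair avoids it. This is only a sketch, not kernel code: every `model_*` function is hypothetical, and the single depth counter is an assumed simplification of the state RCU really tracks per CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a nonzero depth means RCU is watching this CPU. */
static int rcu_watching_depth;

/* Idle entry: RCU stops watching the CPU (hypothetical stand-in). */
static void model_idle_enter(void)
{
    rcu_watching_depth = 0;
}

/* Hypothetical stand-ins for the rcu_nmi_enter()/rcu_nmi_exit() pairing. */
static void model_rcu_nmi_enter(void)
{
    rcu_watching_depth++;
}

static void model_rcu_nmi_exit(void)
{
    assert(rcu_watching_depth > 0);  /* enter and exit must be balanced */
    rcu_watching_depth--;
}

/* Stand-in for the lockdep check behind "suspicious RCU usage". */
static bool model_rcu_read_section_ok(void)
{
    return rcu_watching_depth > 0;
}

/*
 * Model a software breakpoint hit while the CPU is in the idle task.
 * Returns true if the handler's RCU read-side section would be legal.
 */
bool model_debug_exception_from_idle(bool use_nmi_enter)
{
    bool ok;

    model_idle_enter();
    if (use_nmi_enter)
        model_rcu_nmi_enter();

    /* The handler's rcu_read_lock() section would run here. */
    ok = model_rcu_read_section_ok();

    if (use_nmi_enter)
        model_rcu_nmi_exit();
    return ok;
}
```

In this model, model_debug_exception_from_idle(false) reports an illegal read-side section, mirroring the splat on default_idle_call, while passing true does not. The depth counter also hints at why nesting matters: an exception that interrupts another NMI-like context has to keep the enter/exit calls balanced.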

> > > Anyway, it seems that normal irqs are also not calling irq_enter/exit
> > > except for arch/arm64/kernel/smp.c
> >
> > drivers/irqchip/irq-gic.c:gic_handle_irq() either calls
> > handle_domain_irq() or handle_IPI(). The enter/exit calls live in
> > those functions.
>
> Ah, I see. Do you think we need to put rcu_nmi_enter/exit() on
> do_mem_abort() and do_sp_pc_abort() too, similar to x86?

Hmm, it seems that adding rcu_nmi_enter/exit to both functions causes the init process to fail. For now I won't do that.

Thank you,

--
Masami Hiramatsu [off-list ref]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel