Re: [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead
From: Pingfan Liu <hidden>
Date: 2021-10-09 04:15:04
Also in:
lkml
On Fri, Oct 08, 2021 at 08:45:23AM -0700, Paul E. McKenney wrote:
> On Fri, Oct 08, 2021 at 12:01:25PM +0800, Pingfan Liu wrote:
> > Sorry that I missed this message and I am just back from a long festival.
> >
> > Adding Paul for RCU guidance.
>
> Didn't the recent patch series cover this, or is this a new problem?
Sorry for not explaining it clearly. This is a new problem.
The recently acked series derives from [3-4/5], which addresses nested calls:
in a single normal interrupt handler
    rcu_irq_enter()
      rcu_irq_enter()
      ...
      rcu_irq_exit()
    rcu_irq_exit()
This new problem [1-2/5], by contrast, is about pNMI (which is similar to NMI in this context).
On arm64, the current process in a pNMI handler looks like:
rcu_irq_enter(){ rcu_nmi_enter()}
    ^^^ At this point, the handler is temporarily treated as a normal interrupt (there has been no chance yet to call __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);).
    So rcu_nmi_enter() cannot tell that it is running in an NMI: in_nmi() is still false, so the "if (!in_nmi())" checks take the non-NMI path. (goto "questionA")
nmi_enter()
NMI handler
nmi_exit()
rcu_irq_exit()
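
The ordering issue above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_*` / `MODEL_*` name is hypothetical, the bit offsets are assumed to mirror the usual preempt_count layout, and the RCU entry is reduced to the single question "does it see in_nmi()?".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout for this sketch: HARDIRQ bits at 16, NMI bits at 20. */
#define MODEL_HARDIRQ_OFFSET (1u << 16)
#define MODEL_NMI_OFFSET     (1u << 20)
#define MODEL_NMI_MASK       (0xfu << 20)

static unsigned int model_preempt_count;

/* Stand-in for in_nmi(): true only once the NMI bits are accounted. */
static bool model_in_nmi(void)
{
	return (model_preempt_count & MODEL_NMI_MASK) != 0;
}

/* Stand-in for the preempt_count part of nmi_enter(). */
static void model_nmi_enter(void)
{
	model_preempt_count += MODEL_NMI_OFFSET + MODEL_HARDIRQ_OFFSET;
}

/* Stand-in for the preempt_count part of nmi_exit(). */
static void model_nmi_exit(void)
{
	model_preempt_count -= MODEL_NMI_OFFSET + MODEL_HARDIRQ_OFFSET;
}

/* Returns true if the RCU entry code would take its "!in_nmi()" paths. */
static bool model_rcu_entry_takes_irq_path(void)
{
	return !model_in_nmi();
}

/* Ordering described above: RCU entry runs before nmi_enter(). */
static bool model_current_order(void)
{
	bool irq_path;

	model_preempt_count = 0;
	irq_path = model_rcu_entry_takes_irq_path();	/* rcu_irq_enter() */
	model_nmi_enter();
	/* NMI handler body would run here */
	model_nmi_exit();
	assert(model_preempt_count == 0);
	return irq_path;
}

/* Alternative ordering: the NMI is accounted before the RCU entry runs. */
static bool model_nmi_first_order(void)
{
	bool irq_path;

	model_preempt_count = 0;
	model_nmi_enter();
	irq_path = model_rcu_entry_takes_irq_path();
	model_nmi_exit();
	return irq_path;
}
```

In this model, model_current_order() reports that the IRQ-only paths are taken even though the event is a pNMI, while model_nmi_first_order() reports that they are skipped.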
[...]

> > Refer to rcu_nmi_enter(), which can be called by
> > enter_from_kernel_mode():
> >
> > ||noinstr void rcu_nmi_enter(void)
> > ||{
> > ||	...
> > ||	if (rcu_dynticks_curr_cpu_in_eqs()) {
> > ||
> > ||		if (!in_nmi())
> > ||			rcu_dynticks_task_exit();
> > ||
> > ||		// RCU is not watching here ...
> > ||		rcu_dynticks_eqs_exit();
> > ||		// ... but is watching here.
> > ||
> > ||		if (!in_nmi()) {
> > ||			instrumentation_begin();
> > ||			rcu_cleanup_after_idle();
> > ||			instrumentation_end();
> > ||		}
> > ||
> > ||		instrumentation_begin();
> > ||		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> > ||		instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
> > ||		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> > ||		instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> > ||
> > ||		incby = 1;
> > ||	} else if (!in_nmi()) {
> > ||		instrumentation_begin();
> > ||		rcu_irq_enter_check_tick();
> > ||	} else {
> > ||		instrumentation_begin();
> > ||	}
> > ||	...
> > ||}
> >
> > There is 3 pieces of code put under the protection of if (!in_nmi()).
> > At least the last one "rcu_irq_enter_check_tick()" can trigger a hard
> > lock up bug. Because it is supposed to hold a spin lock with irqoff by
> > "raw_spin_lock_rcu_node(rdp->mynode)", but pNMI can breach it. The same
> > scenario in rcu_nmi_exit()->rcu_prepare_for_idle().
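
The hard lockup described in the quoted text can be sketched with a toy, non-recursive lock. This is an illustrative model under stated assumptions, not the kernel's locking: all `model_*` names are hypothetical, the lock stands in for the rcu_node lock taken via raw_spin_lock_rcu_node(), and the other conditions the real rcu_irq_enter_check_tick() checks before locking are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock standing in for the rcu_node spinlock. */
struct model_node_lock {
	bool held;
};

/* Returns false where a real raw spinlock would spin forever. */
static bool model_lock_acquire(struct model_node_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void model_lock_release(struct model_node_lock *lock)
{
	lock->held = false;
}

/*
 * Stand-in for the "else if (!in_nmi())" branch: the lock is taken only
 * when the entry code believes it is not in an NMI.
 * Returns false if the acquisition would deadlock.
 */
static bool model_rcu_entry(struct model_node_lock *lock, bool sees_in_nmi)
{
	if (sees_in_nmi)
		return true;		/* branch skipped, lock untouched */
	if (!model_lock_acquire(lock))
		return false;		/* same CPU already holds the lock */
	model_lock_release(lock);
	return true;
}

/*
 * The interrupted context holds the lock with IRQs off; a pNMI still
 * arrives because it is not masked by that, and runs the RCU entry code.
 */
static bool model_pnmi_during_locked_section(bool entry_sees_in_nmi)
{
	struct model_node_lock lock = { .held = false };
	bool got = model_lock_acquire(&lock);
	bool ok;

	assert(got);
	(void)got;
	ok = model_rcu_entry(&lock, entry_sees_in_nmi);
	model_lock_release(&lock);
	return ok;
}
```

With entry_sees_in_nmi set to false (the situation described for pNMI), the model reports a would-be deadlock; with it set to true, the locking branch is skipped and the handler proceeds.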
questionA:
> > As for the first two "if (!in_nmi())", I have no idea of why, except
> > breaching spin_lock_irq() by NMI. Hope Paul can give some guide.

Thanks,

Pingfan
Thanks, Pingfan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel