Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
From: Sohil Mehta <hidden>
Date: 2025-05-07 21:49:17
Also in:
kvm, linux-edac, linux-perf-users, linux-pm, lkml
On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:quoted
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index a1d672dcb6f0..183e3e717326 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.cquoted
static int nmi_handle(unsigned int type, struct pt_regs *regs) { struct nmi_desc *desc = nmi_to_desc(type); + unsigned long source_bitmap = 0;unsigned long source = ~0UL;
Thanks! This makes the logic even simpler by getting rid of match_nmi_source(). A minor change described further down. Also, do you prefer "source" over "source_bitmap"? I had it as such to avoid confusion between source_vector and source_bitmap.
quoted
nmi_handler_t ehandler; struct nmiaction *a; int handled=0;@@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs) rcu_read_lock(); + /* + * Activate NMI source-based filtering only for Local NMIs. + * + * Platform NMI types (such as SERR and IOCHK) have only one + * handler registered per type, so there is no need to + * disambiguate between multiple handlers. + * + * Also, if a platform source ends up setting bit 2 in the + * source bitmap, the local NMI handlers would be skipped since + * none of them use this reserved vector. + * + * For Unknown NMIs, avoid using the source bitmap to ensure all + * potential handlers have a chance to claim responsibility. + */ + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) + source_bitmap = fred_event_data(regs);if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) { source = fred_event_data(regs); if (source & BIT(0)) source = ~0UL; }
Looks good, except when fred_event_data() returns 0. I don't expect it
to happen in practice. But, maybe with new hardware and eventually
different hypervisors being involved, it is a possibility.
We can either call it a bug that an NMI happened without source
information. Or be extra nice and do this:
if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
source = fred_event_data(regs);
if (!source || (source & BIT(0)))
source = ~0UL;
}
quoted
/* * NMIs are edge-triggered, which means if you have enough * of them concurrently, you can lose some because only one * can be latched at any given time. Walk the whole list * to handle those situations. + * + * However, NMI-source reporting does not have this limitation. + * When NMI-source information is available, only run the + * handlers that match the reported vectors. */ list_for_each_entry_rcu(a, &desc->head, list) { int thishandled; u64 delta; + if (source_bitmap && !match_nmi_source(source_bitmap, a)) + continue;if (!(souce & BIT(a->source_vector))) continue;quoted
delta = sched_clock(); thishandled = a->handler(type, regs); handled += thishandled;