Thread (24 messages) 24 messages, 5 authors, 2025-05-09

Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information

From: Sohil Mehta <hidden>
Date: 2025-05-07 21:49:17
Also in: kvm, linux-edac, linux-perf-users, linux-pm, lkml

On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
quoted
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a1d672dcb6f0..183e3e717326 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
quoted
 static int nmi_handle(unsigned int type, struct pt_regs *regs)
 {
 	struct nmi_desc *desc = nmi_to_desc(type);
+	unsigned long source_bitmap = 0;
	unsigned long source = ~0UL;
Thanks! This makes the logic even simpler by getting rid of
match_nmi_source(). A minor change described further down.

Also, do you prefer "source" over "source_bitmap"? I had it as such to
avoid confusion between source_vector and source_bitmap.
quoted
 	nmi_handler_t ehandler;
 	struct nmiaction *a;
 	int handled=0;
@@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
 
 	rcu_read_lock();
 
+	/*
+	 * Activate NMI source-based filtering only for Local NMIs.
+	 *
+	 * Platform NMI types (such as SERR and IOCHK) have only one
+	 * handler registered per type, so there is no need to
+	 * disambiguate between multiple handlers.
+	 *
+	 * Also, if a platform source ends up setting bit 2 in the
+	 * source bitmap, the local NMI handlers would be skipped since
+	 * none of them use this reserved vector.
+	 *
+	 * For Unknown NMIs, avoid using the source bitmap to ensure all
+	 * potential handlers have a chance to claim responsibility.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
+		source_bitmap = fred_event_data(regs);
	if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
		source = fred_event_data(regs);
		if (source & BIT(0))
			source = ~0UL;
	}
Looks good, except when fred_event_data() returns 0. I don't expect it
to happen in practice. But, maybe with new hardware and eventually
different hypervisors being involved, it is a possibility.

We can either call it a bug that an NMI happened without source
information. Or be extra nice and do this:

if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
	source = fred_event_data(regs);
	if (!source || (source & BIT(0)))
		source = ~0UL;
}
quoted
 	/*
 	 * NMIs are edge-triggered, which means if you have enough
 	 * of them concurrently, you can lose some because only one
 	 * can be latched at any given time.  Walk the whole list
 	 * to handle those situations.
+	 *
+	 * However, NMI-source reporting does not have this limitation.
+	 * When NMI-source information is available, only run the
+	 * handlers that match the reported vectors.
 	 */
 	list_for_each_entry_rcu(a, &desc->head, list) {
 		int thishandled;
 		u64 delta;
 
+		if (source_bitmap && !match_nmi_source(source_bitmap, a))
+			continue;
		if (!(souce & BIT(a->source_vector)))
			continue;
quoted
 		delta = sched_clock();
 		thishandled = a->handler(type, regs);
 		handled += thishandled;
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help