Thread (51 messages) 51 messages, 3 authors, 2020-10-09

Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

From: David Woodhouse <dwmw2@infradead.org>
Date: 2020-10-07 07:48:58
Also in: kvm, linux-iommu

On Tue, 2020-10-06 at 23:54 +0200, Thomas Gleixner wrote:
On Mon, Oct 05 2020 at 16:28, David Woodhouse wrote:
quoted
From: David Woodhouse <redacted>

When interrupt remapping isn't enabled, only the first 255 CPUs can
No, only CPUs with an APICid < 255 ....
Ack.
quoted
receive external interrupts. Set the appropriate max affinity for
the IOAPIC and MSI IRQ domains accordingly.

This also fixes the case where interrupt remapping is enabled but some
devices are not within the scope of any active IOMMU.
What? If this fixes an pre-existing problem then

      1) Explain the problem proper
      2) Have a patch at the beginning of the series which fixes it
         independently of this pile

If it's fixing a problem in your pile, then you got the ordering wrong.
It's not that simple; there's not a single patch which fixes that and
which can go first. I can, and do, fix the "no IR" case in a simple
patch that goes first, simply by restricting the kernel to the APIC IDs
which can be targeted.

This is the case I called out in the cover letter:

    This patch series implements a per-domain "maximum affinity" set and
    uses it for the non-remapped IOAPIC and MSI domains on x86. As well as
    allowing more CPUs to be used without interrupt remapping, this also
    fixes the case where some IOAPICs or PCI devices aren't actually in
    scope of any active IOMMU and are operating without remapping.

To fix *that* case, we really do need the whole series giving us per-
domain restricted affinity, and to use it for those MSIs/IOAPICs that
the IRQ remapping doesn't cover.
You didn't start kernel programming as of yesterday, so you really know
how that works.
quoted
 	ip->irqdomain->parent = parent;
+	if (parent == x86_vector_domain)
+		irq_domain_set_affinity(ip->irqdomain, &x86_non_ir_cpumask);
OMG
This IOAPIC function may or may not be behind remapping.
quoted
 	if (cfg->type == IOAPIC_DOMAIN_LEGACY ||
 	    cfg->type == IOAPIC_DOMAIN_STRICT)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 4d891967bea4..af5ce5c4da02 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -259,6 +259,7 @@ struct irq_domain * __init native_create_pci_msi_domain(void)
 		pr_warn("Failed to initialize PCI-MSI irqdomain.\n");
 	} else {
 		d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
+		irq_domain_set_affinity(d, &x86_non_ir_cpumask);
So here it's unconditional
Yes, this code is only for the non-remapped case and there's a separate
arch_create_remap_msi_irq_domain() function a few lines further down
which creates the IR-PCI-MSI IRQ domain. So no need for a condition
here.
quoted
 	}
 	return d;
 }
@@ -479,6 +480,8 @@ struct irq_domain *hpet_create_irq_domain(int hpet_id)
 		irq_domain_free_fwnode(fn);
 		kfree(domain_info);
 	}
+	if (parent == x86_vector_domain)
+		irq_domain_set_affinity(d, &x86_non_ir_cpumask);
And here we need a condition again. Completely obvious and reviewable - NOT.
And HPET may or may not be behind remapping so again needs the
condition. I had figured that part was fairly obvious but can note it
in the commit comment.

Attachments

  • smime.p7s [application/x-pkcs7-signature] 5174 bytes
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help