RE: [5.14-rc1] mlx5_core receives no interrupts with maxcpus=8
From: Dexuan Cui <decui@microsoft.com>
Date: 2021-08-18 21:08:25
Also in:
linux-pci, lkml
> From: Thomas Gleixner <redacted>
> Sent: Wednesday, July 21, 2021 2:17 PM
> To: Dexuan Cui <decui@microsoft.com>; Saeed Mahameed
>
> On Mon, Jul 19 2021 at 20:33, Dexuan Cui wrote:
> > This is a bare metal x86-64 host with Intel CPUs. Yes, I believe the
> > issue is in the IOMMU Interrupt Remapping mechanism rather than in the
> > NIC driver. I just don't understand why bringing the CPUs online and
> > offline can work around the issue. I'm trying to dump the IOMMU IR
> > table entries to look for any error.
>
> can you please enable GENERIC_IRQ_DEBUGFS and provide the output of
>
>   cat /sys/kernel/debug/irq/irqs/$THENICIRQS
>
> Thanks,
>
>         tglx

Sorry for the late response! I checked the debugfs file below, and the
output is exactly the same in the good and bad cases. In both cases I boot
with maxcpus=8; the only difference in the good case is that I online and
then offline CPUs 8-31:
for i in `seq 8 31`; do echo 1 > /sys/devices/system/cpu/cpu$i/online; done
for i in `seq 8 31`; do echo 0 > /sys/devices/system/cpu/cpu$i/online; done
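
The same online/offline cycle can be wrapped in a small helper that skips CPUs whose sysfs file is missing or not writable. This is only a sketch: the function names set_cpus and cycle_cpus and the CPU_SYSFS override are made up for illustration, and writing to the real sysfs files requires root.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original report.
# CPU_SYSFS can be pointed at a scratch directory for a dry run;
# by default the real sysfs path is used.

set_cpus() {
    # $1 = first CPU, $2 = last CPU, $3 = 1 (online) or 0 (offline)
    base="${CPU_SYSFS:-/sys/devices/system/cpu}"
    for i in $(seq "$1" "$2"); do
        f="$base/cpu$i/online"
        if [ -w "$f" ]; then
            echo "$3" > "$f" || echo "cpu$i: write failed" >&2
        else
            echo "cpu$i: $f not writable, skipping" >&2
        fi
    done
}

cycle_cpus() {
    # Bring the whole range online first, then take it offline again,
    # in the same order as the two loops above.
    set_cpus "$1" "$2" 1
    set_cpus "$1" "$2" 0
}
```

With these definitions, `cycle_cpus 8 31` would repeat the workaround used in the good case.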

# cat /sys/kernel/debug/irq/irqs/209
handler:  handle_edge_irq
device:   0000:d8:00.0
status:   0x00004000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x35409200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_ON_ACTIVATE
            IRQD_CAN_RESERVE
            IRQD_HANDLE_ENFORCE_IRQCTX
node:     1
affinity: 0-7
effectiv: 5
pending:
domain:  INTEL-IR-MSI-3-3
 hwirq:   0x6c00000
 chip:    IR-PCI-MSI
  flags:   0x30
             IRQCHIP_SKIP_SET_WAKE
             IRQCHIP_ONESHOT_SAFE
 parent:
    domain:  INTEL-IR-3
     hwirq:   0x20000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0xd1
         chip:    APIC
          flags:   0x0
         Vector:    42
         Target:     5
         move_in_progress: 0
         is_managed:       0
         can_reserve:      1
         has_reserved:     0
         cleanup_pending:  0
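
To double-check that the dump really is identical in the two cases, the output can be saved under a label and diffed. This is a sketch only: save_irq_state, compare_irq_state, and the IRQ_DEBUGFS and OUT_DIR variables are hypothetical names, and reading the real debugfs file requires root and CONFIG_GENERIC_IRQ_DEBUGFS.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original report.

save_irq_state() {
    # $1 = IRQ number, $2 = label (for example "bad" or "good")
    src="${IRQ_DEBUGFS:-/sys/kernel/debug/irq/irqs}/$1"
    dst="${OUT_DIR:-.}/irq$1.$2"
    cat "$src" > "$dst"
}

compare_irq_state() {
    # $1 = IRQ number, $2 and $3 = labels of two saved snapshots.
    # diff exits with 0 when the snapshots are identical.
    dir="${OUT_DIR:-.}"
    diff -u "$dir/irq$1.$2" "$dir/irq$1.$3"
}
```

For example, run `save_irq_state 209 bad` right after booting with maxcpus=8, then `save_irq_state 209 good` after the online/offline cycle, and finally `compare_irq_state 209 bad good`.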
Thanks,
Dexuan