Re: [PATCH 0/37] PCI/MSI: Enforce explicit IRQ vector management by removing devres auto-free
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: 2026-02-23 17:38:55
Also in:
dmaengine, dri-devel, linux-arm-msm, linux-crypto, linux-cxl, linux-gpio, linux-i2c, linux-i3c, linux-input, linux-iommu, linux-media, linux-mmc, linux-pci, linux-riscv, linux-serial, linux-spi, linux-usb, platform-driver-x86
On Tue, Feb 24, 2026 at 12:09:37AM +0800, Shawn Lin wrote:
> On Mon, 2026/02/23 23:50, Andy Shevchenko wrote:
> > On Mon, Feb 23, 2026 at 5:32 PM Shawn Lin [off-list ref] wrote:
> > > This patch series addresses a long-standing design issue in the PCI/MSI subsystem where the implicit, automatic management of IRQ vectors by the devres framework conflicts with explicit driver cleanup, creating ambiguity and potential resource management bugs.
> > >
> > > ==== The Problem: Implicit vs. Explicit Management ====
> > >
> > > Historically, `pcim_enable_device()` not only manages standard PCI resources (BARs) via devres but also implicitly triggers automatic IRQ vector management by setting a flag that registers `pcim_msi_release()` as a cleanup action.
> > >
> > > This creates an ambiguous ownership model. Many drivers follow a pattern of:
> > >
> > > 1. Calling `pci_alloc_irq_vectors()` to allocate interrupts.
> > > 2. Also calling `pci_free_irq_vectors()` in their error paths or remove routines.
> > >
> > > When such a driver also uses `pcim_enable_device()`, the devres framework may attempt to free the IRQ vectors a second time upon device release, leading to a double-free. Analysis of the tree shows this hazardous pattern exists widely, while 35 other drivers correctly rely solely on the implicit cleanup.
> >
> > Is this confirmed? From what I read in the cover letter, this series was only compile-tested, so how can you prove the problem exists in the first place?
>
> Yes, it's confirmed. Debugging a double-free issue in an out-of-tree PCIe wifi driver, which uses pcim_enable_device + pci_alloc_irq_vectors + pci_free_irq_vectors, exposed it. And we did have a TODO to clean up this hybrid usage, targeted for this cycle [1], as suggested by Philipp:
Okay, fair enough. I think this bit was missing in the cover letter.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=msi
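To make the hybrid pattern under discussion concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct toy_pci_dev` and every `toy_*` function are hypothetical stand-ins invented for illustration, and the model only counts how many times a free is attempted. It does not claim to reproduce what the kernel actually does on the second attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; these are NOT the kernel's structures or functions. */
struct toy_pci_dev {
	bool enabled_managed;   /* set by the managed enable call (assumed model) */
	bool vectors_allocated; /* true while the driver holds IRQ vectors */
	int  free_calls;        /* number of free attempts seen so far */
};

/* Stand-in for pcim_enable_device(): implicitly opts in to IRQ cleanup. */
static void toy_pcim_enable_device(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->enabled_managed = true;
}

/* Stand-in for pci_alloc_irq_vectors(). */
static void toy_pci_alloc_irq_vectors(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->vectors_allocated = true;
}

/* Stand-in for pci_free_irq_vectors(): records every attempt. */
static void toy_pci_free_irq_vectors(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->vectors_allocated = false;
	d->free_calls++;
}

/* Stand-in for the devres release step at device teardown. */
static void toy_devres_release(struct toy_pci_dev *d)
{
	if (d->enabled_managed)
		toy_pci_free_irq_vectors(d);
}

/* A "hybrid" driver: managed enable plus an explicit free in remove(). */
int toy_hybrid_driver_free_calls(void)
{
	struct toy_pci_dev d = {0};

	toy_pcim_enable_device(&d);
	toy_pci_alloc_irq_vectors(&d);
	toy_pci_free_irq_vectors(&d); /* explicit free in the driver's remove path */
	toy_devres_release(&d);       /* implicit free at device release */
	return d.free_calls;
}
```

In this model, `toy_hybrid_driver_free_calls()` returns 2: one free from the driver and one from the implicit release, which is the ownership ambiguity the cover letter describes.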
> > > ==== The Solution: Making Management Explicit ====
> > >
> > > This series enforces a clear, predictable model:
> > >
> > > 1. New Managed API (Patch 1/37): Introduces pcim_alloc_irq_vectors() and pcim_alloc_irq_vectors_affinity(). Drivers that desire devres-managed IRQ vectors should use these functions, which set the is_msi_managed flag and ensure automatic cleanup.
> > > 2. Patches 2 through 36 convert each driver that uses pcim_enable_device() alongside pci_alloc_irq_vectors() and relies on devres for IRQ vector cleanup to instead make an explicit call to pcim_alloc_irq_vectors().
> > > 3. Core Change (Patch 37/37): With the conversions above in place, this patch modifies pcim_setup_msi_release() to check only the is_msi_managed flag. This decouples automatic IRQ cleanup from pcim_enable_device(). IRQ vectors allocated via pci_alloc_irq_vectors*() are now solely the driver's responsibility to free with pci_free_irq_vectors().
> > >
> > > With these changes, we get a clear ownership model: explicit resource management eliminates ambiguity and follows the "principle of least surprise." New drivers should choose one model and be consistent:
> > >
> > > - Use `pci_alloc_irq_vectors()` + `pci_free_irq_vectors()` for explicit control.
> > > - Use `pcim_alloc_irq_vectors()` for devres-managed, automatic cleanup.
> >
> > Have you checked previous attempts? Why is your series better than those?
>
> There seem to be no previous attempts.
Maybe we are looking at different projects...
https://lore.kernel.org/all/?q=pcim_alloc_irq_vectors
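For readers comparing the two ownership models proposed in the quoted cover letter, the following userspace sketch may help. It is a toy model under stated assumptions: `struct toy_pci_dev` and all `toy_*` functions are hypothetical and only borrow the names `is_msi_managed`, `pci_alloc_irq_vectors()` and `pcim_alloc_irq_vectors()` from the cover letter; the real kernel signatures and internals are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; these are NOT the kernel's structures or functions. */
struct toy_pci_dev {
	bool is_msi_managed;    /* mirrors the flag named in the cover letter */
	bool vectors_allocated; /* true while IRQ vectors are held */
	int  free_calls;        /* number of free attempts seen so far */
};

/* Stand-in for pci_free_irq_vectors(): records every attempt. */
static void toy_pci_free_irq_vectors(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->vectors_allocated = false;
	d->free_calls++;
}

/* Explicit model: allocation leaves is_msi_managed untouched. */
static void toy_pci_alloc_irq_vectors(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->vectors_allocated = true;
}

/* Managed model: allocation also opts in to automatic cleanup. */
static void toy_pcim_alloc_irq_vectors(struct toy_pci_dev *d)
{
	toy_pci_alloc_irq_vectors(d);
	d->is_msi_managed = true;
}

/* Release step as described for the core change: only the flag is checked. */
static void toy_devres_release(struct toy_pci_dev *d)
{
	assert(d != NULL);
	if (d->is_msi_managed)
		toy_pci_free_irq_vectors(d);
}

/* Driver using explicit alloc + free; the release step must not free again. */
int toy_explicit_driver_free_calls(void)
{
	struct toy_pci_dev d = {0};

	toy_pci_alloc_irq_vectors(&d);
	toy_pci_free_irq_vectors(&d); /* driver frees in its remove path */
	toy_devres_release(&d);       /* no-op: flag was never set */
	return d.free_calls;
}

/* Driver using the managed allocator; cleanup happens only at release. */
int toy_managed_driver_free_calls(void)
{
	struct toy_pci_dev d = {0};

	toy_pcim_alloc_irq_vectors(&d);
	toy_devres_release(&d);       /* the single, automatic free */
	return d.free_calls;
}
```

In this model both drivers end up with exactly one free, regardless of whether the device was enabled through a managed call, which is the decoupling the cover letter aims for.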
> > > ==== Testing And Review ====
> > >
> > > 1. This series is only compile-tested with allmodconfig.
> > > 2. Given the substantial size of this patch series, I have structured the mailing to facilitate efficient review. The cover letter, the first patch and the last one will be sent to all relevant mailing lists and key maintainers to ensure broad visibility and initial feedback on the overall approach. The remaining subsystem-specific patches will be sent only to the respective subsystem maintainers and their associated mailing lists, reducing noise.
--
With Best Regards,
Andy Shevchenko