Re: [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug
From: Jiri Slaby <jirislaby@kernel.org>
Date: 2025-09-09 09:00:45
Also in:
linux-pci, lkml
On 15. 07. 25, 23:36, Timothy Pearson wrote:
quoted hunk ↗ jump to hunk
In cases where the root of a nested PCIe bridge configuration is unplugged, the pnv_php driver would leak the allocated IRQ resources for the child bridges' hotplug event notifications, resulting in a panic. Fix this by walking all child buses and deallocating all it's IRQ resources before calling pci_hp_remove_devices. Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so that it is only destroyed in pnv_php_free_slot, instead of pnv_php_disable_irq. This is required since pnv_php_disable_irq will now be called by workers triggered by hot unplug interrupts, so the workqueue needs to stay allocated. The abridged kernel panic that occurs without this patch is as follows: WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2 Call Trace: msi_device_data_release+0x34/0x9c (unreliable) release_nodes+0x64/0x13c devres_release_all+0xc0/0x140 device_del+0x2d4/0x46c pci_destroy_dev+0x5c/0x194 pci_hp_remove_devices+0x90/0x128 pci_hp_remove_devices+0x44/0x128 pnv_php_disable_slot+0x54/0xd4 power_write_file+0xf8/0x18c pci_slot_attr_store+0x40/0x5c sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x3bc/0x50c ksys_write+0x84/0x140 system_call_exception+0x124/0x230 system_call_vectored_common+0x15c/0x2ec Signed-off-by: Shawn Anastasio <redacted> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> --- drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++------- 1 file changed, 75 insertions(+), 19 deletions(-)diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 573a41869c15..aec0a6d594ac 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c
...
quoted hunk ↗ jump to hunk
@@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
This is preceded by:
php_slot = kzalloc(sizeof(*php_slot), GFP_KERNEL);
Ie. php_slot is zeroed.
return NULL;
}
+ /* Allocate workqueue for this slot's interrupt handling */
+ php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+ if (!php_slot->wq) {
+ SLOT_WARN(php_slot, "Cannot alloc workqueue\n");I believe this introduced a (unlikely) NULL ptr dereference.
+ kfree(php_slot->name); + kfree(php_slot); + return NULL; + } + if (dn->child && PCI_DN(dn->child)) php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn); else
This continues:
php_slot->pdev = bus->self;
php_slot->bus = bus;
And SLOT_WARN() is defined as:
#define SLOT_WARN(sl, x...) \
((sl)->pdev ? pci_warn((sl)->pdev, x) :
dev_warn(&(sl)->bus->dev, x))
The else branch is alkays taken in the 'if' above, which still
dereferences NULLed (sl)->bus here.
quoted hunk ↗ jump to hunk
@@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq) u16 sts, ctrl; int ret; - /* Allocate workqueue */ - php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name); - if (!php_slot->wq) { - SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
Here, php_slot used to have both ->pdev and ->bus assigned at this point.
- pnv_php_disable_irq(php_slot, true); - return; - } -
Right? thanks, -- js suse labs