Re: [PATCH] powerpc/eeh: avoid possible crash when edev->pdev changes
From: Ganesh G R <hidden>
Date: 2024-06-13 13:49:55
On 6/11/24 8:18 AM, Michael Ellerman wrote:
Hi Ganesh,

Ganesh Goudar [off-list ref] writes:

If a PCI device is removed during eeh_pe_report_edev(), edev->pdev will change and can cause a crash, hold the PCI rescan/remove lock while taking a copy of edev->pdev.

Signed-off-by: Ganesh Goudar <redacted>
---
 arch/powerpc/kernel/eeh_pe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index d1030bc52564..49f968733912 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -859,7 +859,9 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 	/* Retrieve the parent PCI bus of first (top) PCI device */
 	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
+	pci_lock_rescan_remove();
 	pdev = eeh_dev_to_pci_dev(edev);
+	pci_unlock_rescan_remove();
 	if (pdev)
 		return pdev->bus;

What prevents pdev being freed/reused immediately after you drop the rescan/remove lock?
Yeah, I should have released the lock only after getting the bus address. I will send a v2.
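To make the intended ordering concrete, here is a rough userspace sketch, not the actual v2. The struct definitions are cut-down stand-ins for the kernel types, the lock is modelled as a plain flag rather than the real rescan/remove mutex, and edev_bus_get_locked() is a hypothetical name used only for illustration. eeh_dev_to_pci_dev() is written the way I understand the kernel helper to behave, i.e. it returns the pointer without taking a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct pci_bus { int number; };
struct pci_dev { struct pci_bus *bus; };
struct eeh_dev { struct pci_dev *pdev; };

/* Stand-in for the PCI rescan/remove lock: a flag, not a real mutex. */
static bool rescan_remove_held;

static void pci_lock_rescan_remove(void)
{
	assert(!rescan_remove_held);
	rescan_remove_held = true;
}

static void pci_unlock_rescan_remove(void)
{
	assert(rescan_remove_held);
	rescan_remove_held = false;
}

/* Assumed to mirror the kernel helper: no reference is taken on pdev. */
static struct pci_dev *eeh_dev_to_pci_dev(struct eeh_dev *edev)
{
	return edev ? edev->pdev : NULL;
}

/* Hypothetical helper: read pdev->bus before dropping the lock. */
static struct pci_bus *edev_bus_get_locked(struct eeh_dev *edev)
{
	struct pci_bus *bus = NULL;
	struct pci_dev *pdev;

	pci_lock_rescan_remove();
	pdev = eeh_dev_to_pci_dev(edev);
	if (pdev)
		bus = pdev->bus;	/* pdev is dereferenced under the lock */
	pci_unlock_rescan_remove();

	return bus;
}
```

This only closes the window in which pdev is dereferenced after the lock is dropped. Whether the returned bus pointer stays valid for the caller is a separate question that the sketch does not answer.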
AFAICS eeh_dev_to_pci_dev() doesn't take an additional reference to the pdev or anything.
Yes, I think we have to evaluate what can go wrong in each case where the reference is not taken. But we need this lock here regardless: if a PCI error is encountered in the hotplug remove path, the PCI rescan/remove lock avoids a race between the hotplug remove path and the bottom half of EEH recovery. Because the hotplug remove path already holds the lock, it runs to completion, and the recovery process is then dropped since the device is no longer present.