pci_reset_function() acquires device_lock before performing the reset,
and the PCI core calls pdsc_remove() with device_lock already held. If
pdsc_pci_reset_thread() is running when pdsc_remove() is called,
destroy_workqueue() blocks waiting for the work to complete while the
work is blocked waiting for device_lock, resulting in a deadlock.

Use pci_try_reset_function() instead, which takes its locks through
pci_dev_trylock(). This acquires both the device lock and the PCI
config access lock without blocking: if either lock is contended, it
returns -EAGAIN immediately. This avoids the deadlock while still
serializing config space access during the reset. A reset skipped
because of contention is retried by the watchdog timer.

Also remove the pci_dev_get()/pci_dev_put() calls, which were
unnecessary: the driver-owned workqueue is destroyed in pdsc_remove(),
which guarantees the work completes before remove returns, and the PCI
core holds its reference to the pci_dev throughout the unbind sequence.

Fixes: 81665adf25d2 ("pds_core: Fix pdsc_check_pci_health function to use work thread")
Reported-by: Sashiko AI Review <sashiko-bot@kernel.org>
Signed-off-by: Nikhil P. Rao <redacted>
---
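The locking argument in the changelog can be sketched in userspace. The
following is a minimal pthread analogy under stated assumptions, not
kernel code: device_lock, reset_work() and simulate_remove_during_reset()
are hypothetical stand-ins for the device lock, pdsc_pci_reset_thread()
and the remove path, and pthread_join() plays the role of
destroy_workqueue().

```c
#include <errno.h>
#include <pthread.h>

/* Userspace stand-in for the device lock; not the kernel API. */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for pdsc_pci_reset_thread() after the fix. */
static void *reset_work(void *arg)
{
	int *result = arg;

	if (pthread_mutex_trylock(&device_lock) != 0) {
		/* Lock is contended: give up instead of sleeping. */
		*result = -EAGAIN;
		return NULL;
	}

	/* The reset itself would run here with the lock held. */
	*result = 0;
	pthread_mutex_unlock(&device_lock);
	return NULL;
}

/* Hypothetical stand-in for the remove path: the caller already holds
 * device_lock and then waits for the work to finish.
 */
static int simulate_remove_during_reset(void)
{
	pthread_t worker;
	int result = 1;

	pthread_mutex_lock(&device_lock);
	if (pthread_create(&worker, NULL, reset_work, &result) != 0) {
		pthread_mutex_unlock(&device_lock);
		return -1;
	}
	/* Returns because the worker never sleeps on device_lock. */
	pthread_join(worker, NULL);
	pthread_mutex_unlock(&device_lock);

	return result;
}
```

Replacing pthread_mutex_trylock() with pthread_mutex_lock() in
reset_work() makes pthread_join() wait forever, which mirrors the
deadlock described in the changelog.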
drivers/net/ethernet/amd/pds_core/core.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
index 38a2446571af..1074a022a52f 100644
--- a/drivers/net/ethernet/amd/pds_core/core.c
+++ b/drivers/net/ethernet/amd/pds_core/core.c
@@ -606,9 +606,10 @@ void pdsc_pci_reset_thread(struct work_struct *work)
 	struct pdsc *pdsc = container_of(work, struct pdsc, pci_reset_work);
 	struct pci_dev *pdev = pdsc->pdev;
 
-	pci_dev_get(pdev);
-	pci_reset_function(pdev);
-	pci_dev_put(pdev);
+	/* Use try variant to avoid deadlock with pdsc_remove().
+	 * If lock is contended, the watchdog timer will retry.
+	 */
+	pci_try_reset_function(pdev);
 }
 
 static void pdsc_check_pci_health(struct pdsc *pdsc)
--
2.43.0