Re: [PATCH 1/2] PCI: Ensure error recoverability at all times
From: Lukas Wunner <lukas@wunner.de>
Date: 2025-11-13 09:38:18
Also in:
linux-pci
On Wed, Nov 12, 2025 at 04:38:31PM -0600, Bjorn Helgaas wrote:
On Sun, Oct 12, 2025 at 03:25:01PM +0200, Lukas Wunner wrote:
Despite these workarounds, recoverability at all times is not guaranteed: E.g. when a PCIe port goes through a runtime suspend and resume cycle, the "saved_state" flag is cleared by:
  pci_pm_runtime_resume()
    pci_pm_default_resume_early()
      pci_restore_state()
... and hence on a subsequent AER event, the port's Config Space cannot be restored.
I guess this restore would be done by a driver's pci_error_handlers.slot_reset() or .reset_done() calling pci_restore_state()?
Yes. Restoration of config space after an error-recovery-induced reset is currently always the job of the device driver. E.g. in the case of portdrv, it happens in pcie_portdrv_slot_reset(). We could revisit this design decision and change the behavior to have pcie_do_recovery() call pci_restore_state(), thus reducing boilerplate in the drivers. But that would be a separate effort, orthogonal to the present patch.
+++ b/drivers/pci/bus.c
@@ -358,6 +358,13 @@ void pci_bus_add_device(struct pci_dev *dev)
 	pci_bridge_d3_update(dev);
 	/*
+	 * Save config space for error recoverability. Clear state_saved
+	 * to detect whether drivers invoked pci_save_state() on suspend.
Can we expand this a little to explain how this is detected and what drivers *should* be doing?
That is documented in Documentation/power/pci.rst, "3.1.2. suspend()":
"This callback is expected to quiesce the device and prepare it to be
put into a low-power state by the PCI subsystem. It is not required
(in fact it even is not recommended) that a PCI driver's suspend()
callback save the standard configuration registers of the device [...]
However, in some rare case it is convenient to carry out these
operations in a PCI driver. Then, pci_save_state() [...] should be
used to save the device's standard configuration registers [...].
Moreover, if the driver calls pci_save_state(), the PCI subsystem will
not execute either pci_prepare_to_sleep(), or pci_set_power_state()
for its device, so the driver is then responsible for handling the
device as appropriate."
I think the reason is that the PCI core can invoke pci_save_state() on suspend if the driver did not.
Right. By calling pci_save_state(), a driver signals to the PCI core that it assumes responsibility for putting the device into a low power state. If a driver wants to keep a device in D0, it could call pci_save_state() and thus prevent the PCI core from putting it e.g. into D3.
I assume:
- PCI core always calls pci_save_state() and clears state_saved when
device is enumerated (below)
- When it has configured the device to the state it wants restore,
the driver may call pci_save_state() again, which will set
state_saved
- If driver has not called pci_save_state(), i.e., state_saved is
still clear, we want the PCI core to call pci_save_state() during
suspend
Right.
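The three points above can be sketched as a small userspace model. This is not kernel code: struct model_dev and the model_*() helpers are hypothetical stand-ins for struct pci_dev, pci_save_state(), pci_bus_add_device() and the core's suspend path, and the real functions do considerably more.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct pci_dev (assumption). */
struct model_dev {
	unsigned int config[4];       /* stand-in for live config space */
	unsigned int saved_config[4]; /* stand-in for the saved copy */
	bool state_saved;             /* mirrors the flag of the same name */
};

/* Models pci_save_state(): copy config space and set the flag. */
static void model_save_state(struct model_dev *dev)
{
	memcpy(dev->saved_config, dev->config, sizeof(dev->config));
	dev->state_saved = true;
}

/* Models enumeration with the patch: save once, then clear the flag. */
static void model_bus_add_device(struct model_dev *dev)
{
	model_save_state(dev);
	dev->state_saved = false;
}

/*
 * Models the core's suspend step. Returns true if the core had to save
 * the state itself, false if the driver already did so.
 */
static bool model_core_suspend(struct model_dev *dev)
{
	if (dev->state_saved)
		return false; /* driver saved; it owns the power transition */
	model_save_state(dev);
	return true;
}
```

In this model, a driver that never calls model_save_state() leaves the flag clear after enumeration, so model_core_suspend() saves on its behalf. A driver that does call it beforehand causes model_core_suspend() to return false.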
This sounds sensible to me. It would be nice if there were a few more words about pci_save_state() and pci_restore_state() in Documentation/. pci_save_state() isn't mentioned at all in Documentation/PCI
Right, it's documented in the Documentation/power directory. :)
The "state_saved" flag in struct pci_dev is an internal flag used by the PCI core to keep track of whether a driver called pci_save_state() on suspend. The logic to update the flag is not modified by the patch, deliberately so to avoid any breakage.
The flag is currently initialized to false in pci_device_add() (even though it already is false due to kzalloc() zeroing the memory). I'm now later calling pci_save_state() in pci_bus_add_device(), which sets the flag to true. To preserve the existing logic, I am resetting the flag to false again.
The only change made by the patch is to not invalidate the saved state upon pci_restore_state() and thus allow re-using it for error recovery. The patch seeks to avoid changing the behavior of suspend/resume. I wanted to keep this minimal, non-intrusive and as low risk as possible.
Thanks,
Lukas
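To illustrate the effect of no longer invalidating the saved state on restore, here is a simplified userspace model. It is a sketch under assumptions, not the patch itself: struct sk_dev and the sk_*() helpers are hypothetical, and the "old" variant only mirrors the behavior described in this thread, where a restore is skipped once the flag has been cleared.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct pci_dev (assumption). */
struct sk_dev {
	unsigned int config[4];       /* stand-in for live config space */
	unsigned int saved_config[4]; /* stand-in for the saved copy */
	bool state_saved;
};

/* Models pci_save_state(). */
static void sk_save_state(struct sk_dev *dev)
{
	memcpy(dev->saved_config, dev->config, sizeof(dev->config));
	dev->state_saved = true;
}

/* Models a reset during error recovery: config space loses its contents. */
static void sk_reset(struct sk_dev *dev)
{
	memset(dev->config, 0, sizeof(dev->config));
}

/* Old behavior in this model: restoring consumes the saved state. */
static bool sk_restore_old(struct sk_dev *dev)
{
	if (!dev->state_saved)
		return false; /* nothing considered valid, restore skipped */
	memcpy(dev->config, dev->saved_config, sizeof(dev->config));
	dev->state_saved = false;
	return true;
}

/*
 * Patched behavior in this model: the saved copy stays usable, while
 * the flag is still cleared so the suspend/resume logic is unchanged.
 */
static bool sk_restore_new(struct sk_dev *dev)
{
	memcpy(dev->config, dev->saved_config, sizeof(dev->config));
	dev->state_saved = false;
	return true;
}
```

With sk_restore_old(), a second reset after one restore leaves config space zeroed, which corresponds to the runtime-resume-then-AER case from the commit message. With sk_restore_new(), the same sequence restores the original values both times.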