Re: [PATCH 1/2] PCI: Ensure error recoverability at all times
From: Bjorn Helgaas <helgaas@kernel.org>
Date: 2025-11-13 16:16:07
Also in:
linux-pci
On Thu, Nov 13, 2025 at 10:38:09AM +0100, Lukas Wunner wrote:
On Wed, Nov 12, 2025 at 04:38:31PM -0600, Bjorn Helgaas wrote:quoted
On Sun, Oct 12, 2025 at 03:25:01PM +0200, Lukas Wunner wrote:quoted
Despite these workarounds, recoverability at all times is not guaranteed: E.g. when a PCIe port goes through a runtime suspend and resume cycle, the "saved_state" flag is cleared by: pci_pm_runtime_resume() pci_pm_default_resume_early() pci_restore_state() ... and hence on a subsequent AER event, the port's Config Space cannot be restored.I guess this restore would be done by a driver's pci_error_handlers.slot_reset() or .reset_done() calling pci_restore_state()?Yes. Restoration of config space after an error-recovery-induced reset is currently always the job of the device driver. E.g. in the case of portdrv, it happens in pcie_portdrv_slot_reset(). We could revisit this design decision and change the behavior to have pcie_do_recovery() call pci_restore_state(), thus reducing boilerplate in the drivers. But that would be a separate effort, orthogonal to the present patch.
Agreed.
quoted
quoted
+++ b/drivers/pci/bus.c@@ -358,6 +358,13 @@ void pci_bus_add_device(struct pci_dev *dev) pci_bridge_d3_update(dev); /* + * Save config space for error recoverability. Clear state_saved + * to detect whether drivers invoked pci_save_state() on suspend.Can we expand this a little to explain how this is detected and what drivers *should* be doing?That is documented in Documentation/power/pci.rst, "3.1.2. suspend()": "This callback is expected to quiesce the device and prepare it to be put into a low-power state by the PCI subsystem. It is not required (in fact it even is not recommended) that a PCI driver's suspend() callback save the standard configuration registers of the device [...] However, in some rare case it is convenient to carry out these operations in a PCI driver. Then, pci_save_state() [...] should be used to save the device's standard configuration registers [...]. Moreover, if the driver calls pci_save_state(), the PCI subsystem will not execute either pci_prepare_to_sleep(), or pci_set_power_state() for its device, so the driver is then responsible for handling the device as appropriate."
Yes. I should have proposed some text for the comment, e.g., Save config space for error recoverability. Clear state_saved. If driver calls pci_save_state() again, state_saved will be set and we'll know that on suspend, the PCI core shouldn't call pci_save_state() or change the device power state.
quoted
I think the reason is that the PCI core can invoke pci_save_state() on suspend if the driver did not.Right. By calling pci_save_state(), a driver signals to the PCI core that it assumes responsibility for putting the device into a low power state. If a driver wants to keep a device in D0, it could call pci_save_state() and thus prevent the PCI core from putting it e.g. into D3.
It seems like there are two things going on here, and I'm not sure
they're completely compatible:
1) Driver calls pci_save_state() to take over device power
management and prevent the PCI core from doing it.
2) Driver calls pci_save_state() to capture the device state it
wants to restore when recovering from an error.
Shouldn't a driver be able to do 2) without also getting 1)?
quoted
I assume: - PCI core always calls pci_save_state() and clears state_saved when device is enumerated (below) - When it has configured the device to the state it wants restore, the driver may call pci_save_state() again, which will set state_saved - If driver has not called pci_save_state(), i.e., state_saved is still clear, we want the PCI core to call pci_save_state() during suspendRight.quoted
This sounds sensible to me. It would be nice if there were a few more words about pci_save_state() and pci_restore_state() in Documentation/. pci_save_state() isn't mentioned at all in Documentation/PCIRight, it's documented in the Documentation/power directory. :)
Yes, in the pci.rst I mentioned, but it mostly uses the "saves the device's standard configuration registers" wording. I'm just wishing for a more concrete mention of "pci_save_state()", since that's where the critical "state_saved" flag is updated. And I'm not sure Documentation/ includes anything about the idea of a driver using pci_save_state() to capture the state it wants to restore after an error. I guess that's obvious now that I write it down, but I'm sure learning a lot from this conversation :) Bjorn