Re: [PATCH v6 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
From: Brian Norris <briannorris@chromium.org>
Date: 2025-08-28 20:01:55
Also in: linux-arm-kernel, linux-arm-msm, linux-pci, linux-rockchip, lkml
On Tue, Jul 15, 2025 at 07:51:03PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> Hi,
>
> Currently, in the event of AER/DPC, PCI core will try to reset the slot
> (Root Port) and its subordinate devices by invoking bridge control reset
> and FLR. But in some cases like AER Fatal error, it might be necessary to
> reset the Root Ports using the PCI host bridge drivers in a platform
> specific way (as indicated by the TODO in the pcie_do_recovery() function
> in drivers/pci/pcie/err.c). Otherwise, the PCI link won't be recovered
> successfully.
>
> So this series adds a new callback 'pci_host_bridge::reset_root_port' for
> the host bridge drivers to reset the Root Port when a fatal error happens.
>
> Also, this series allows the host bridge drivers to handle PCI link down
> event by resetting the Root Ports and recovering the bus. This is
> accomplished with the help of the new 'pci_host_handle_link_down()' API.
> Host bridge drivers are expected to call this API (preferably from a
> threaded IRQ handler) with relevant Root Port 'pci_dev' when a link down
> event is detected for the port. The API will reuse the pcie_do_recovery()
> function to recover the link if AER support is enabled, otherwise it will
> directly call the reset_root_port() callback of the host bridge driver (if
> it exists).
>
> For reference, I've modified the pcie-qcom driver to call
> pci_host_handle_link_down() API with Root Port 'pci_dev' after receiving
> the LINK_DOWN global_irq event and populated
> 'pci_host_bridge::reset_root_port()' callback to reset the Root Port. Since
> the Qcom PCIe controllers support only a single Root Port (slot) per
> controller instance, the API is going to be invoked only once. For multi
> Root Port controllers, the controller driver is expected to detect the Root
> Port that received the link down event and call the
> pci_host_handle_link_down() API with 'pci_dev' of that Root Port.
>
> Testing
> -------
>
> I've lost access to my test setup now. So Krishna (Cced) will help with
> testing on the Qcom platform and Wilfred or Niklas should be able to test
> it on Rockchip platform. For the moment, this series is compile tested
> only.
For the series:

Tested-by: Brian Norris <briannorris@chromium.org>

I've tested the whole thing on Qualcomm SC7280 Herobrine systems with
NVMe. After adding a debugfs node to control toggling PERST, I can force
the link to reset, and see it recover and resume NVMe traffic.

I've tested the first two on Pixel phones, using a non-upstream
DWC-based driver that I'm working on getting in better shape. (We've
previously supported a custom link-error API setup instead.)

I'd love to see this available upstream.

Regards,
Brian
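
For readers skimming the thread, the link-down dispatch described in the quoted cover letter can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the code from the series: every mock_* name, the callback signature, and the return values are hypothetical stand-ins, and the recovery branch merely counts a call where the kernel would run pcie_do_recovery().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct mock_root_port {
	int id;
};

struct mock_host_bridge {
	/*
	 * Models the 'pci_host_bridge::reset_root_port' callback named in the
	 * cover letter. The signature here is an assumption.
	 */
	int (*reset_root_port)(struct mock_host_bridge *bridge,
			       struct mock_root_port *port);
	bool aer_enabled;	/* models "AER support is enabled" */
	int recoveries;		/* times the recovery path was taken */
	int resets;		/* times the platform callback ran */
	int last_reset_id;	/* id of the last port that was reset */
};

enum mock_path {
	MOCK_PATH_NONE,		/* no AER and no callback: nothing to do */
	MOCK_PATH_RECOVERY,	/* stands in for pcie_do_recovery() */
	MOCK_PATH_DIRECT_RESET,	/* callback invoked directly */
};

/* Example platform callback: records which Root Port was reset. */
int mock_reset_root_port(struct mock_host_bridge *bridge,
			 struct mock_root_port *port)
{
	bridge->resets++;
	bridge->last_reset_id = port->id;
	return 0;
}

/*
 * Models the decision described for pci_host_handle_link_down(): use the
 * recovery flow when AER is enabled, otherwise call the host bridge
 * callback directly if one is populated.
 */
enum mock_path mock_handle_link_down(struct mock_host_bridge *bridge,
				     struct mock_root_port *port)
{
	if (bridge->aer_enabled) {
		/* The real recovery flow is not modelled here. */
		bridge->recoveries++;
		return MOCK_PATH_RECOVERY;
	}

	if (bridge->reset_root_port) {
		bridge->reset_root_port(bridge, port);
		return MOCK_PATH_DIRECT_RESET;
	}

	return MOCK_PATH_NONE;
}
```

In this model, a controller with several Root Ports would pass the mock_root_port that saw the link-down event, mirroring the expectation stated in the cover letter for multi Root Port controllers.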