Re: [PATCH v9 00/14] pci: Work around ASMedia ASM2824 PCIe link training failures
From: Bjorn Helgaas <helgaas@kernel.org>
Date: 2023-06-14 23:20:50
Also in:
linux-pci, linux-rdma, lkml, netdev
Subsystem:
pci subsystem, the rest · Maintainers:
Bjorn Helgaas, Linus Torvalds
On Sun, Jun 11, 2023 at 06:19:08PM +0100, Maciej W. Rozycki wrote:
Hi, This is v9 of the change to work around a PCIe link training phenomenon where a pair of devices both capable of operating at a link speed above 2.5GT/s seems unable to negotiate the link speed and continues training indefinitely with the Link Training bit switching on and off repeatedly and the data link layer never reaching the active state. With several requests addressed and a few extra issues spotted this version has now grown to 14 patches. It has been verified for device enumeration with and without PCI_QUIRKS enabled, using the same piece of RISC-V hardware as previously. Hot plug or reset events have not been verified, as this is difficult if at all feasible with hardware in question. Last iteration: <https://lore.kernel.org/r/alpine.DEB.2.21.2304060100160.13659@angie.orcam.me.uk/ (local)>, and my input to it: <https://lore.kernel.org/r/alpine.DEB.2.21.2306080224280.36323@angie.orcam.me.uk/ (local)>.
Thanks, I applied these to pci/enumeration for v6.5.
I tweaked a few things, so double-check to be sure I didn't break
something:
- Moved dev->link_active_reporting init to set_pcie_port_type()
because it does other PCIe-related stuff.
- Reordered to keep all the link_active_reporting things together.
- Reordered to clean up & factor pcie_retrain_link() before exposing
it to the rest of the PCI core.
- Moved pcie_retrain_link() a little earlier to keep it next to
pcie_wait_for_link_status().
- Squashed the stubs into the actual quirk so we don't have the
intermediate state where we call the stubs but they never do
anything (let me know if there's a reason we need your order).
- Inline pcie_parent_link_retrain(), which seemed like it didn't add
enough to be worthwhile.
Interdiff below:
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 80694e2574b8..f11268924c8f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c@@ -1153,27 +1153,16 @@ void pci_resume_bus(struct pci_bus *bus) pci_walk_bus(bus, pci_resume_one, NULL); } -/** - * pcie_parent_link_retrain - Check and retrain link we are downstream from - * @dev: PCI device to handle. - * - * Return TRUE if the link was retrained, FALSE otherwise. - */ -static bool pcie_parent_link_retrain(struct pci_dev *dev) -{ - struct pci_dev *bridge; - - bridge = pci_upstream_bridge(dev); - if (bridge) - return pcie_failed_link_retrain(bridge); - else - return false; -} - static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) { - bool retrain = true; int delay = 1; + bool retrain = false; + struct pci_dev *bridge; + + if (pci_is_pcie(dev)) { + retrain = true; + bridge = pci_upstream_bridge(dev); + } /* * After reset, the device should not silently discard config
@@ -1201,9 +1190,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) } if (delay > PCI_RESET_WAIT) { - if (retrain) { + if (retrain && bridge) { retrain = false; - if (pcie_parent_link_retrain(dev)) { + if (pcie_failed_link_retrain(bridge)) { delay = 1; continue; }
@@ -4914,6 +4903,38 @@ static bool pcie_wait_for_link_status(struct pci_dev *pdev, return (lnksta & lnksta_mask) == lnksta_match; } +/** + * pcie_retrain_link - Request a link retrain and wait for it to complete + * @pdev: Device whose link to retrain. + * @use_lt: Use the LT bit if TRUE, or the DLLLA bit if FALSE, for status. + * + * Retrain completion status is retrieved from the Link Status Register + * according to @use_lt. It is not verified whether the use of the DLLLA + * bit is valid. + * + * Return TRUE if successful, or FALSE if training has not completed + * within PCIE_LINK_RETRAIN_TIMEOUT_MS milliseconds. + */ +bool pcie_retrain_link(struct pci_dev *pdev, bool use_lt) +{ + u16 lnkctl; + + pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl); + lnkctl |= PCI_EXP_LNKCTL_RL; + pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, lnkctl); + if (pdev->clear_retrain_link) { + /* + * Due to an erratum in some devices the Retrain Link bit + * needs to be cleared again manually to allow the link + * training to succeed. + */ + lnkctl &= ~PCI_EXP_LNKCTL_RL; + pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, lnkctl); + } + + return pcie_wait_for_link_status(pdev, use_lt, !use_lt); +} + /** * pcie_wait_for_link_delay - Wait until link is active or inactive * @pdev: Bridge device
@@ -4968,37 +4989,6 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active) return pcie_wait_for_link_delay(pdev, active, 100); } -/** - * pcie_retrain_link - Request a link retrain and wait for it to complete - * @pdev: Device whose link to retrain. - * @use_lt: Use the LT bit if TRUE, or the DLLLA bit if FALSE, for status. - * - * Retrain completion status is retrieved from the Link Status Register - * according to @use_lt. It is not verified whether the use of the DLLLA - * bit is valid. - * - * Return TRUE if successful, or FALSE if training has not completed. - */ -bool pcie_retrain_link(struct pci_dev *pdev, bool use_lt) -{ - u16 lnkctl; - - pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl); - lnkctl |= PCI_EXP_LNKCTL_RL; - pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, lnkctl); - if (pdev->clear_retrain_link) { - /* - * Due to an erratum in some devices the Retrain Link bit - * needs to be cleared again manually to allow the link - * training to succeed. - */ - lnkctl &= ~PCI_EXP_LNKCTL_RL; - pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, lnkctl); - } - - return pcie_wait_for_link_status(pdev, use_lt, !use_lt); -} - /* * Find maximum D3cold delay required by all the devices on the bus. The * spec says 100 ms, but firmware can lower it and we allow drivers to
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 016a9d4a61f7..f547db0a728f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c@@ -1526,6 +1526,7 @@ void set_pcie_port_type(struct pci_dev *pdev) { int pos; u16 reg16; + u32 reg32; int type; struct pci_dev *parent;
@@ -1539,6 +1540,10 @@ void set_pcie_port_type(struct pci_dev *pdev) pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->devcap); pdev->pcie_mpss = FIELD_GET(PCI_EXP_DEVCAP_PAYLOAD, pdev->devcap); + pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, ®32); + if (reg32 & PCI_EXP_LNKCAP_DLLLARC) + pdev->link_active_reporting = 1; + parent = pci_upstream_bridge(pdev); if (!parent) return;
@@ -1828,7 +1833,6 @@ int pci_setup_device(struct pci_dev *dev) int err, pos = 0; struct pci_bus_region region; struct resource *res; - u32 linkcap; hdr_type = pci_hdr_type(dev);
@@ -1876,10 +1880,6 @@ int pci_setup_device(struct pci_dev *dev) /* "Unknown power state" */ dev->current_state = PCI_UNKNOWN; - /* Set it early to make it available to fixups, etc. */ - pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap); - dev->link_active_reporting = !!(linkcap & PCI_EXP_LNKCAP_DLLLARC); - /* Early fixups, before probing the BARs */ pci_fixup_device(pci_fixup_early, dev);