Re: [PATCH v9 00/14] pci: Work around ASMedia ASM2824 PCIe link training failures
From: "Maciej W. Rozycki" <macro@orcam.me.uk>
Date: 2023-06-15 00:41:17
Also in:
linux-pci, linux-rdma, linuxppc-dev, lkml
On Wed, 14 Jun 2023, Bjorn Helgaas wrote:
quoted
This is v9 of the change to work around a PCIe link training phenomenon where a pair of devices both capable of operating at a link speed above 2.5GT/s seems unable to negotiate the link speed and continues training indefinitely with the Link Training bit switching on and off repeatedly and the data link layer never reaching the active state. With several requests addressed and a few extra issues spotted this version has now grown to 14 patches. It has been verified for device enumeration with and without PCI_QUIRKS enabled, using the same piece of RISC-V hardware as previously. Hot plug or reset events have not been verified, as this is difficult if at all feasible with hardware in question. Last iteration: <https://lore.kernel.org/r/alpine.DEB.2.21.2304060100160.13659@angie.orcam.me.uk/ (local)>, and my input to it: <https://lore.kernel.org/r/alpine.DEB.2.21.2306080224280.36323@angie.orcam.me.uk/ (local)>.Thanks, I applied these to pci/enumeration for v6.5.
Great, thanks!
I tweaked a few things, so double-check to be sure I didn't break
something:
- Moved dev->link_active_reporting init to set_pcie_port_type()
because it does other PCIe-related stuff.
- Reordered to keep all the link_active_reporting things together.
- Reordered to clean up & factor pcie_retrain_link() before exposing
it to the rest of the PCI core.
- Moved pcie_retrain_link() a little earlier to keep it next to
pcie_wait_for_link_status().
- Squashed the stubs into the actual quirk so we don't have the
intermediate state where we call the stubs but they never do
anything (let me know if there's a reason we need your order).
- Inline pcie_parent_link_retrain(), which seemed like it didn't add
enough to be worthwhile.Ack, I'll double-check and report back. A minor nit I've spotted below:
static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
{
- bool retrain = true;
int delay = 1;
+ bool retrain = false;
+ struct pci_dev *bridge;
+
+ if (pci_is_pcie(dev)) {
+ retrain = true;
+ bridge = pci_upstream_bridge(dev);
+ }
If doing it this way, which I actually like, I think it would be a little
bit better performance- and style-wise if this was written as:
if (pci_is_pcie(dev)) {
bridge = pci_upstream_bridge(dev);
retrain = !!bridge;
}
(or "retrain = bridge != NULL" if you prefer this style), and then we
don't have to repeatedly check two variables iff (pcie && !bridge) in the
loop below:
quoted hunk ↗ jump to hunk
@@ -1201,9 +1190,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) } if (delay > PCI_RESET_WAIT) { - if (retrain) { + if (retrain && bridge) {
-- i.e. code can stay then as:
if (retrain) {
here. I hope you find this observation rather obvious, so will you amend
your tree, or shall I send an incremental update?
Otherwise I don't find anything suspicious with the interdiff itself
(thanks for posting it, that's really useful indeed!), but as I say I'll
yet double-check how things look and work with your tree. Hopefully
tomorrow (Thu), as I have other stuff yet to complete tonight.
Maciej