Thread (15 messages) 15 messages, 3 authors, 2018-05-04

LS1043A : "synchronous abort" at boot due to PCI config read

From: Gilles Buloz <hidden>
Date: 2018-05-02 12:57:31
Also in: linux-pci

Le 30/04/2018 19:53, Gilles BULOZ a ?crit :
Le 30/04/2018 19:04, Bjorn Helgaas a ?crit :
quoted
On Mon, Apr 30, 2018 at 01:36:53PM +0000, Gilles Buloz wrote:
quoted
Le 30/04/2018 10:46, Gilles BULOZ a ?crit :
quoted
Le 27/04/2018 18:56, Bjorn Helgaas a ?crit :
quoted
On Fri, Apr 27, 2018 at 12:29:32PM +0000, Gilles Buloz wrote:
quoted
Le 27/04/2018 10:43, Ard Biesheuvel a ?crit :
quoted
(add Bjorn and linux-pci)

On 13 April 2018 at 19:32, Gilles Buloz [off-list ref] wrote:
quoted
Dear developers,

I currently have two functional workarounds for this issue but
would like to know which one you would recommend, if any :-) I'm
using a LS1043A CPU (NXP QorIQ Layerscape) and get a "synchronous
external abort" when booting because of a PCI config read during
PCI scan.

I'm using a custom hardware (based on LS1043ARDB) having a
PEX8112 PCIe-to-PCI bridge connected to the LS1043A to have a PCI
slot for legacy devices. This bridge only supports PCI-Compatible
config accesses (offset 0x00-0xFF).
I would guess the PEX8112 itself has 4K of config space, but it only
forwards 256 bytes of config space to the conventional PCI secondary
bus.
quoted
quoted
quoted
On this PCI slot I connect a PCI module made of a PCI-to-PCIe
bridge plus PCIe devices behind.

The problem occurs when the kernel probes the PCIe devices : as
they are PCIe devices, the kernel does a PCI config read access
at offset 0x100 to check if "PCIe extended capability registers"
are accessible (see drivers/pci/probe.c, function
pci_cfg_space_size_ext()). Unfortunately the PEX8112 PCIe-to-PCI
bridge that is in the path reports an error to the CPU for this
access, and it seems there's no way to disable that on this
bridge.

The first workaround I found was to patch
drivers/pci/host/pci-layerscape.c to have PCIE_ABSERR_SETTING set
to 0x9400 instead of 0x9401 (for PCIE_ABSERR register) to disable
error reporting. This only impacts an NXP part of the Linux
kernel code, but I'm not sure this is a good idea (however it
seems to be like that on Intel platforms where even MEM accesses
to a no-device address return FF without any error).

I've also tried another workaround that works : patch
drivers/pci/probe.c to use bus_flags to remember if a bus is
behind a bridge without extended address capability, to avoid PCi
config read accesses at offset 0x100 in pci_cfg_space_size() /
pci_cfg_space_size_ext(). But this patch impacts the generic PCI
probe method of Linux.

Any Idea to properly handle that issue ?
This seems like a rather unusual configuration, but I guess that
if the first bridge/switch advertises its inability to support
extended config space accesses, we should not be performing them
on any of its subordinate buses. How does the PEX8112 advertise
this limitation?

That said, I wonder if it is reasonable in the first place to
expect that a PCIe device works as expected passing through a
legacy PCI layer like that.
The PEX8112 PCIe-to-PCI bridge has capability PCI_CAP_ID_EXP, but
has no PCI_CAP_ID_PCIX capability.  As I understand the lack of
PCI_CAP_ID_PCIX is advertising this limitation on the PCI side (no
support for PCI config offset >=0x100).
Sounds right to me.
quoted
Also I guess in the case of a bridge having PCI_CAP_ID_PCIX, this
limitation would be advertised by the lack of PCI_X_STATUS_266MHZ
and PCI_X_STATUS_533MHZ (as done in drivers/pci/probe.c at
pci_cfg_space_size())
Also sounds right.  Per the PCI-X spec, checking for PCI_X_STATUS_266MHZ
should be enough, but it shouldn't hurt to check for either
PCI_X_STATUS_266MHZ or PCI_X_STATUS_533MHZ.
quoted
I'm currently using the attached patch (for kernel 4.1.35-rt41 from
NXP Yocto BSP). It uses bus_flags to remember if a bus is behind a
bridge without extended address capability to avoid PCi config
accesses at offset >= 0x100. Thanks to this patch I now have a
functional system with functional PCI/PCIe devices.
The patch seems like it's looking at the right things, but I don't
want to build it into pci_scan_bridge_extend() because that function
is much too complicated already.

I'd rather build it into pci_cfg_space_size() or
pci_cfg_space_size_ext() somehow.  Maybe something along these lines?
This doesn't account for the case of a PCIe-to-PCI-X Mode 2 bridge; in
that case, I think all 4K would be accessible on the PCI-X side.
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6fd0bcd..d8b091f0bcd1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1367,7 +1367,7 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev)
    * pci_cfg_space_size - Get the configuration space size of the PCI device
    * @dev: PCI device
    *
- * Regular PCI devices have 256 bytes, but PCI-X 2 and PCI Express devices
+ * Regular PCI devices have 256 bytes, but PCI-X Mode 2 and PCI Express devices
    * have 4096 bytes.  Even if the device is capable, that doesn't mean we can
    * access it.  Maybe we don't have a way to generate extended config space
    * accesses, or the device is behind a reverse Express bridge.  So we try
@@ -1376,9 +1376,14 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev)
    */
   static int pci_cfg_space_size_ext(struct pci_dev *dev)
   {
+    struct pci_dev *bridge = pci_upstream_bridge(dev);
       u32 status;
       int pos = PCI_CFG_SPACE_SIZE;
   +    if (bridge && pci_is_pcie(bridge) &&
+        pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE)
+        return PCI_CFG_SPACE_SIZE;
+
       if (pci_read_config_dword(dev, pos, &status) != PCIBIOS_SUCCESSFUL)
           return PCI_CFG_SPACE_SIZE;
       if (status == 0xffffffff || pci_ext_cfg_is_aliased(dev))
quoted
--- include/linux/pci.h.orig 2018-03-26 16:51:18.050000000 +0000
+++ include/linux/pci.h    2018-03-26 16:51:27.660000000 +0000
@@ -193,6 +193,7 @@
   enum pci_bus_flags {
       PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
       PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+    PCI_BUS_FLAGS_COMPAT_CFG_SPACE = (__force pci_bus_flags_t) 4,
   };
     /* These values come from the PCI Express Spec */
--- drivers/pci/probe.c.orig    2018-01-22 09:29:52.000000000 +0000
+++ drivers/pci/probe.c    2018-03-26 16:54:30.830000000 +0000
@@ -827,6 +827,28 @@
               child->primary = primary;
               pci_bus_insert_busn_res(child, secondary, subordinate);
               child->bridge_ctl = bctl;
+
+            {
+                int pos;
+                u32 status;
+                bool pci_compat_cfg_space = false;
+
+                if (!pci_is_pcie(dev) || (pci_pcie_type(dev) == PCI_EXP_TYPE_PCIE_BRIDGE) || (pci_pcie_type(dev) ==
PCI_EXP_TYPE_PCI_BRIDGE)) {
+                    /* for PCI/PCI bridges, or PCIe/PCI bridge in forward or reverse mode, we have to check for PCI-X
capabilities */
+                    pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
+                    if (pos) {
+                        pci_read_config_dword(dev, pos + PCI_X_STATUS, &status);
+                        if (!(status & (PCI_X_STATUS_266MHZ | PCI_X_STATUS_533MHZ)))
+                            pci_compat_cfg_space = true;
+                    } else {
+                        pci_compat_cfg_space = true;
+                    }
+                    if (pci_compat_cfg_space) {
+                        dev_info(&dev->dev, "[%04x:%04x] Child bus limited to PCI-Compatible config space\n", dev->vendor,
dev->device);
+                        child->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE;
+                    }
+                }
+            }
           }
             cmax = pci_scan_child_bus(child);
@@ -1098,6 +1120,11 @@
               goto fail;
       }
   +    if (dev->bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE) {
+        dev_info(&dev->dev, "[%04x:%04x] PCI-Compatible config space only due to parent bus(es)\n", dev->vendor, dev->device);
+        return PCI_CFG_SPACE_SIZE;
+    }
+
       return pci_cfg_space_size_ext(dev);
      fail:
Bjorn,
If I'm right about your proposed patch to
pci_cfg_space_size_ext(), *bridge is pointing to the upper device
of device *dev being checked. I understand the purpose, but I
think this fails for my config that is :

LS1043 PCIe root -> PEX8112 PCIe-to-PCI bridge -> PMC slot connector -> PCI-to-PCIe bridge -> PCIe switch (4 ports) -> 4 PCIe
devices (one on each port)

because :
- when pci_cfg_space_size_ext() is run on the 4 PCIe devices,
*bridge is the PCIe switch which is not matching
PCI_EXP_TYPE_PCI_BRIDGE. In this case *bridge should also be
checked for the parent bus of the PCIe switch, and so on.
- when pci_cfg_space_size_ext() is run for the PCI-to-PCIe bridge,
*bridge is the PEX8112 that is also not matching
PCI_EXP_TYPE_PCI_BRIDGE but PCI_EXP_TYPE_PCIE_BRIDGE. This leads
to a config access at offset 0x100 to the PCI-to-PCIe bridge, so a
crash (because of the PEX8112)

I think setting a bit in bus_flags when creating a child bus is
very efficient because once set it is automatically inherited by
all child buses and then the only thing that pci_cfg_space_size()
has to do for each device is to check for this bit. Also this
PCI_BUS_FLAGS_COMPAT_CFG_SPACE flag is actually a bus property
that is compliant with the purpose of bus_flags.
Yeah, it needs to be inherited somehow, and I don't like the idea of
traversing up the tree, so I prefer your idea.  Although I don't
actually see the inheritance in the patch below -- I thought there
would be something like this:

   dev = bus->self;
   parent_bus = dev->bus;
   if (parent_bus && parent_bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE)
     bus->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE;

pci_scan_bridge_extend() calls pci_add_new_bus() from two places.  You
added a call to pci_bus_check_compat_cfg_space() at one of them, and
it's not obvious why we wouldn't need it at the other place, too.

Can you set this up in pci_alloc_child_bus()?  If you can put it
there, it would be clear that every time we allocate a secondary bus,
we figure out whether extended config space is accessible on that bus.

That doesn't cover the root bus case, where we currently assume the
host bridge can generate config accesses to all config space supported
by devices on the root bus.  But we don't have a problem there, so I
guess we don't need to worry about it now.

If you can put it in pci_alloc_child_bus(), could you make your new
function return a boolean, e.g., pci_bus_ext_cfg_accessible(), or
similar, and then use the result to set the
PCI_BUS_FLAGS_COMPAT_CFG_SPACE flag?  Names like "*_check_*()" don't
tell the reader much about what's happening.
quoted
quoted
I agree that pci_scan_bridge_extend() is already too complicated,
so would you be okay to only add one line to it :
   pci_bus_set_compat_cfg_space(child);
and put all the code I suggested in this new function
pci_bus_set_compat_cfg_space() ? (also supporting PCI-X Mode 2
devices)

Improvement : this function can return immediately if the child
bus has already inherited the flag from its parent.
I mean something like the attached patch I tested this morning...
Sorry, this is for old kernel 4.1.35 but just to clarify what I
propose (also applies to 4.16.6 by changing value of
PCI_BUS_FLAGS_COMPAT_CFG_SPACE in pci.h to 8).
--- include/linux/pci.h.orig    2018-03-26 16:51:18.050000000 +0000
+++ include/linux/pci.h    2018-04-30 09:50:57.660000000 +0000
@@ -193,6 +193,7 @@
  enum pci_bus_flags {
      PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
      PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+    PCI_BUS_FLAGS_COMPAT_CFG_SPACE = (__force pci_bus_flags_t) 4,
  };
    /* These values come from the PCI Express Spec */
--- drivers/pci/probe.c.orig    2018-01-22 09:29:52.000000000 +0000
+++ drivers/pci/probe.c    2018-04-30 13:29:50.600000000 +0000
@@ -754,6 +754,35 @@
                       PCI_EXP_RTCTL_CRSSVE);
  }
  +static void pci_bus_check_compat_cfg_space(struct pci_bus *bus)
+{
+    struct pci_dev *dev = bus->self;
+    bool pci_compat_cfg_space = false;
+    int pos;
+    u32 status;
+
+    if (bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE)
+        return;
+
+    if (!pci_is_pcie(dev) || /* PCI/PCI bridge */
+        (pci_pcie_type(dev) == PCI_EXP_TYPE_PCIE_BRIDGE) || /* PCIe/PCI bridge in forward mode */
+        (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)) { /* PCIe/PCI bridge in reverse mode */
+        pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
+        if (pos) {
+            pci_read_config_dword(dev, pos + PCI_X_STATUS, &status);
+            if (!(status & (PCI_X_STATUS_266MHZ | PCI_X_STATUS_533MHZ)))
+                pci_compat_cfg_space = true;
+        } else {
+            pci_compat_cfg_space = true;
+        }
+        if (pci_compat_cfg_space) {
+            dev_info(&dev->dev, "bus %02x limited to PCI-Compatible config space\n",
+                 bus->number);
+            bus->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE;
+        }
+    }
+}
+
  /*
   * If it's a bridge, configure it and scan the bus behind it.
   * For CardBus bridges, we don't scan behind as the devices will
@@ -827,6 +856,7 @@
              child->primary = primary;
              pci_bus_insert_busn_res(child, secondary, subordinate);
              child->bridge_ctl = bctl;
+            pci_bus_check_compat_cfg_space(child);
          }
            cmax = pci_scan_child_bus(child);
@@ -1084,6 +1114,9 @@
      u32 status;
      u16 class;
  +    if (dev->bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE)
+        return PCI_CFG_SPACE_SIZE;
+
      class = dev->class >> 8;
      if (class == PCI_CLASS_BRIDGE_HOST)
          return pci_cfg_space_size_ext(dev);
The inheritence is made by this line in pci_alloc_child_bus() :
    child->bus_flags = parent->bus_flags;
So once we detect a limitation on a bridge impacting a child bus and that we set the flag in child->bus_flags, this flag is 
automatically present in the child->bus_flags of all its children buses.

I agree with your remarks and will create a function named pci_bus_check_compat_cfg_space() that will be called from 
pci_alloc_child_bus().
I'll test that on Wednesday 2th and will give you my feedback.
Hi Bjorn,
See attached patch (tested ok this morning)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cfgspace3_4.1.35.patch
Type: text/x-patch
Size: 2246 bytes
Desc: cfgspace3_4.1.35.patch
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180502/bd83d6f9/attachment.bin>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help