Re: [PATCH v1] drivers: pci: introduce configurable delay for Rockchip PCIe bus scan | linux-kernel-mentees

quoted

On Sat, May 13, 2023 at 07:40:12AM -0400, Peter Geis wrote:
On Fri, May 12, 2023 at 9:24 PM Bjorn Helgaas [off-list ref] wrote:
[+cc ARM64 folks, in case you have abort handling tips; thread at:
https://lore.kernel.org/r/20230509153912.515218-1-vincenzopalazzodev@gmail.com (local)]

Pine64 RockPro64 panics while enumerating some PCIe devices.  Adding a
delay avoids the panic.  My theory is a PCIe Request Retry Status to a
Vendor ID config read causes an abort that we don't handle.

On Tue, May 09, 2023 at 05:39:12PM +0200, Vincenzo Palazzo wrote:
...
[    1.229856] SError Interrupt on CPU4, code 0xbf000002 -- SError
[    1.229860] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.9.9-2.0-MANJARO-ARM
#1
[    1.229862] Hardware name: Pine64 RockPro64 v2.1 (DT)
[    1.229864] pstate: 60000085 (nZCv daIf -PAN -UAO BTYPE=--)
[    1.229866] pc : rockchip_pcie_rd_conf+0xb4/0x270
[    1.229868] lr : rockchip_pcie_rd_conf+0x1b4/0x270
...
[    1.229939] Kernel panic - not syncing: Asynchronous SError Interrupt
...
[    1.229955]  nmi_panic+0x8c/0x90
[    1.229956]  arm64_serror_panic+0x78/0x84
[    1.229958]  do_serror+0x15c/0x160
[    1.229960]  el1_error+0x84/0x100
[    1.229962]  rockchip_pcie_rd_conf+0xb4/0x270
[    1.229964]  pci_bus_read_config_dword+0x6c/0xd0
[    1.229966]  pci_bus_generic_read_dev_vendor_id+0x34/0x1b0
[    1.229968]  pci_scan_single_device+0xa4/0x144
On Fri, May 12, 2023 at 12:46:21PM +0200, Vincenzo Palazzo wrote:
... Is there any way to tell the kernel "hey we need some more time
here"?
We enumerate PCI devices by trying to read the Vendor ID of every
possible device address (see pci_scan_slot()).  On PCIe, if a device
doesn't exist at that address, the Vendor ID config read will be
terminated with Unsupported Request (UR) status.  This is normal
and happens every time we enumerate devices.

The crash doesn't happen every time we enumerate, so I don't think
this UR is the problem.  Also, if it *were* the problem, adding a
delay would not make any difference.
Is this behavior different if there is a switch device forwarding on
the UR? On rk3399 switches are completely non-functional because of
the panic, which is observed in the output of the dmesg in [2] with
the hack patch enabled. Considering what you just described it looks
like the forwarded UR for each non-existent device behind the switch
is causing an serror.
I don't know exactly what the panic looks like, but I wouldn't expect
UR handling to be different when there's a switch.

pcie-rockchip-host.c does handle devices on the root bus (00)
differently than others because rockchip_pcie_valid_device() knows
that device 00:00 is the only device on the root bus.  That part makes
sense because 00:00 is built into the SoC.

I'm a little suspicious of the fact that rockchip_pcie_valid_device()
also enforces that bus 01 can only have a single device on it.  No
other *_pcie_valid_device() implementations enforce that.  It's true
that traditional PCIe devices can only implement device 00, but ARI
relaxes that by reusing the Device Number as extended Function Number
bits.

There *is* a way for a PCIe device to say "I need more time".  It does
this by responding to that Vendor ID config read with Request Retry
Status (RRS, aka CRS in older specs), which means "I'm not ready yet,
but I will be ready in the future."  Adding a delay would definitely
make a difference here, so my guess is this is what's happening.

Most root complexes return ~0 data to the CPU when a config read
terminates with UR or RRS.  It sounds like rockchip does this for UR
but possibly not for RRS.

There is a "RRS Software Visibility" feature, which is supposed to
turn the RRS into a special value (Vendor ID == 0x0001), but per [1],
rockchip doesn't support it (lspci calls it "CRSVisible").

But the CPU load instruction corresponding to the config read has to
complete by reading *something* or else be aborted.  It sounds like
it's aborted in this case.  I don't know the arm64 details, but if we
could catch that abort and determine that it was an RRS and not a UR,
maybe we could fabricate the magic RRS 0x0001 value.

imx6q_pcie_abort_handler() does something like that, although I think
it's for arm32, not arm64.  But obviously we already catch the abort
enough to dump the register state and panic, so maybe there's a way to
extend that?
Perhaps a hook mechanism that allows drivers to register with the
serror handler and offer to handle specific errors before the generic
code causes the system panic?

Very Respectfully,
Peter Geis

[2] https://lore.kernel.org/linux-pci/CAMdYzYqn3L7x-vc+_K6jG0EVTiPGbz8pQ-N1Q1mRbcVXE822Yg@mail.gmail.com/ (local)

Bjorn

[1] https://lore.kernel.org/linux-pci/CAMdYzYpOFAVq30N+O2gOxXiRtpoHpakFg3LKq3TEZq4S6Y0y0g@mail.gmail.com/ (local)
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help