Thread (44 messages) 44 messages, 6 authors, 2024-06-24

Re: [PATCH v5 08/12] PCI: imx6: Config look up table(LUT) to support MSI ITS and IOMMU for i.MX95

From: Frank Li <Frank.li@nxp.com>
Date: 2024-06-24 15:00:43
Also in: bpf, imx, linux-devicetree, linux-pci, lkml

On Sat, Jun 22, 2024 at 12:38:49PM -0500, Bjorn Helgaas wrote:
On Fri, Jun 21, 2024 at 05:43:21PM -0500, Bjorn Helgaas wrote:
quoted
On Fri, Jun 21, 2024 at 06:29:48PM -0400, Frank Li wrote:
quoted
On Mon, Jun 17, 2024 at 10:26:36AM -0400, Frank Li wrote:
quoted
On Thu, Jun 13, 2024 at 05:41:25PM -0500, Bjorn Helgaas wrote:
quoted
On Thu, Jun 06, 2024 at 04:24:17PM -0400, Frank Li wrote:
quoted
On Mon, Jun 03, 2024 at 04:07:55PM -0400, Frank Li wrote:
quoted
On Mon, Jun 03, 2024 at 01:56:27PM -0500, Bjorn Helgaas wrote:
quoted
On Mon, Jun 03, 2024 at 02:42:45PM -0400, Frank Li wrote:
quoted
On Mon, Jun 03, 2024 at 12:19:21PM -0500, Bjorn Helgaas wrote:
quoted
On Fri, May 31, 2024 at 03:58:49PM +0100, Robin Murphy wrote:
quoted
On 2024-05-31 12:08 am, Bjorn Helgaas wrote:
quoted
[+cc IOMMU and pcie-apple.c folks for comment]

On Tue, May 28, 2024 at 03:39:21PM -0400, Frank Li wrote:
quoted
For the i.MX95, configuration of a LUT is necessary to convert Bus Device
Function (BDF) to stream IDs, which are utilized by both IOMMU and ITS.
This involves examining the msi-map and smmu-map to ensure consistent
mapping of PCI BDF to the same stream IDs. Subsequently, LUT-related
registers are configured. In the absence of an msi-map, the built-in MSI
controller is utilized as a fallback.

Additionally, register a PCI bus notifier to trigger imx_pcie_add_device()
upon the appearance of a new PCI device and when the bus is an iMX6 PCI
controller. This function configures the correct LUT based on Device Tree
Settings (DTS).
This scheme is pretty similar to apple_pcie_bus_notifier().  If we
have to do this, I wish it were *more* similar, i.e., copy the
function names, bitmap tracking, code structure, etc.

I don't really know how stream IDs work, but I assume they are used on
most or all arm64 platforms, so I'm a little surprised that of all the
PCI host drivers used on arm64, only pcie-apple.c and pci-imx6.c need
this notifier.
This is one of those things that's mostly at the mercy of the PCIe root
complex implementation. Typically the SMMU StreamID and/or GIC ITS DeviceID
is derived directly from the PCI RID, sometimes with additional high-order
bits hard-wired to disambiguate PCI segments. I believe this RID-translation
LUT is a particular feature of the the Synopsys IP - I know there's also one
on the NXP Layerscape platforms, but on those it's programmed by the
bootloader, which also generates the appropriate "msi-map" and "iommu-map"
properties to match. Ideally that's what i.MX should do as well, but hey.
Maybe this RID-translation is a feature of i.MX, not of Synopsys?  I
see that the LUT CSR accesses use IMX95_* definitions.
Yes, it convert 16bit RID to 6bit stream id.
IIUC, you're saying this is not a Synopsys feature, it's an i.MX
feature.
Yes, it is i.MX feature. But I think other vendor should have similar
situation if use old arm smmu.
quoted
quoted
quoted
quoted
If it's really necessary to do this programming from Linux, then there's
still no point in it being dynamic - the mappings cannot ever change, since
the rest of the kernel believes that what the DT said at boot time was
already a property of the hardware. It would be a lot more logical, and
likely simpler, for the driver to just read the relevant map property and
program the entire LUT to match, all in one go at controller probe time.
Rather like what's already commonly done with the parsing of "dma-ranges" to
program address-translation LUTs for inbound windows.

Plus that would also give a chance of safely dealing with bad DTs specifying
invalid ID mappings (by refusing to probe at all). As it is, returning an
error from a child's BUS_NOTIFY_ADD_DEVICE does nothing except prevent any
further notifiers from running at that point - the device will still be
added, allowed to bind a driver, and able to start sending DMA/MSI traffic
without the controller being correctly programmed, which at best won't work
and at worst may break the whole system.
Frank, could the imx LUT be programmed once at boot-time instead of at
device-add time?  I'm guessing maybe not because apparently there is a
risk of running out of LUT entries?
It is not good idea to depend on boot loader so much.
I meant "could this be programmed once when the Linux imx host
controller driver is probed?"  But from the below, it sounds like
that's not possible in general because you don't have enough stream
IDs to do that.
Oh! sorry miss understand what your means. It is possible like what I did
at v3 version. But I think it is not good enough.
quoted
quoted
Some hot plug devics
(SD7.0) may plug after system boot. Two PCIe instances shared one set
of 6bits stream id (total 64). Assume total 16 assign to two PCIe
controllers. each have 8 stream id. If use uboot assign it static, each
PCIe controller have below 8 devices.  It will be failrue one controller
connect 7, another connect 9. but if dynamtic alloc when devices add, both
controller can work.

Although we have not so much devices now,  this way give us possility to
improve it in future.
quoted
It sounds like the consequences of running out of LUT entries are
catastrophic, e.g., memory corruption from mis-directed DMA?  If
that's possible, I think we need to figure out how to prevent the
device from being used, not just dev_warn() about it.
Yes, but so far, we have not met such problem now. We can improve it when
we really face such problem.
If this controller can only support DMA from a limited number of
endpoints below it, I think we should figure out how to enforce that
directly.  Maybe we can prevent drivers from enabling bus mastering or
something.  I'm not happy with the idea of waiting for and debugging a
report of data corruption.
It may add a pre-add hook function to pci bridge. let me do more research.
Hi Bjorn:

int pci_setup_device(struct pci_dev *dev)
{
	dev->error_state = pci_channel_io_normal;
	...
	pci_fixup_device(pci_fixup_early, dev);

	^^^ I can add fixup hook for pci_fixup_early. If not resource,
I can set dev->error_state to pci_channel_io_frozen or
pci_channel_io_perm_failure

	And add below check here after call hook function.

	if (dev->error_state != pci_channel_io_normal)
		return -EIO;

}

How do you think this method? If you agree, I can continue search device
remove hook up.
I think this would mean the device would not appear to be enumerated
at all, right?  I.e., it wouldn't show up in lspci?  And we couldn't
use even a pure programmed IO driver with no DMA or MSI?
Make sense. Let me do more research on this.

Frank
quoted
I wonder if we should have a function pointer in struct
pci_host_bridge, kind of like the existing ->map_irq(), where we could
do host bridge-specific setup when enumerating a PCI device.
Consider some device may no use MSI or DMA. It'd better set LUT when
allocate msi irq. I think insert a irq-domain in irq hierarchy.

static const struct irq_domain_ops lut_pcie_msi_domain_ops = {
        .alloc  = lut_pcie_irq_domain_alloc,
        .free   = lut_pcie_irq_domain_free,
};

int dw_pcie_allocate_domains(struct dw_pcie_rp *pp)
{
        struct fwnode_handle *fwnode = of_node_to_fwnode(pci->dev->of_node);

        pp->irq_domain = irq_domain_create_hierarchy(...)

        pp->msi_domain = pci_msi_create_irq_domain(...);

        return 0;
}

Manage lut stream id in lut_pcie_irq_domain_alloc() and
lut_pcie_irq_domain_free().

So failure happen only when driver use MSI and no-stream ID avaiable. It
should be better than failure when add devices. Some devices may not use
at all.
I'm not an IRQ expert, but it sounds plausible.  There might even be
an opportunity to fall back to INTx if there's no stream ID available
for MSI?
Sorry, I think this was a half-baked thought.  Exhaustion of stream
IDs should be an uncommon situation, and the important thing is to
prevent terrible things from happening.

I don't think it's worth bending over backwards to make everything
possible limp along.  If it's easy to just make the device
inaccessible, that's fine.  If there's a simple way to make it
available but keep from enabling bus mastering, we could do that too,
but only if it's really simple.
Mani mentions qcom use simple method to config all lut when probe at
qcom_pcie_config_sid_1_9_0(). It is similar with my v3 version.

https://lore.kernel.org/imx/20240402-pci2_upstream-v3-8-803414bdb430@nxp.com/ (local)

Of course both have not resolved run-out sid problems. But genenerally,
one PCIE slot only connect one devices. static alloc 8/16 sid are enough
for 99.9% user case.

best regards
Frank Li



Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help