Thread (84 messages) 84 messages, 11 authors, 2022-11-17

Re: [RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its domain as parent

From: Marc Zyngier <maz@kernel.org>
Date: 2022-09-08 14:20:07

On Thu, 08 Sep 2022 10:51:01 +0100,
"Radovanovic, Aleksandar" [off-list ref] wrote:
[AMD Official Use Only - General]


quoted
-----Original Message-----
From: Marc Zyngier <maz@kernel.org>
Sent: 08 September 2022 09:08
To: Radovanovic, Aleksandar <redacted>
Cc: Jason Gunthorpe <jgg@nvidia.com>; Gupta, Nipun
[off-list ref]; robh+dt@kernel.org;
krzysztof.kozlowski+dt@linaro.org; gregkh@linuxfoundation.org;
rafael@kernel.org; eric.auger@redhat.com; alex.williamson@redhat.com;
cohuck@redhat.com; Gupta, Puneet (DCG-ENG)
[off-list ref]; song.bao.hua@hisilicon.com;
mchehab+huawei@kernel.org; f.fainelli@gmail.com;
jeffrey.l.hugo@gmail.com; saravanak@google.com;
Michael.Srba@seznam.cz; mani@kernel.org; yishaih@nvidia.com;
robin.murphy@arm.com; will@kernel.org; joro@8bytes.org;
masahiroy@kernel.org; ndesaulniers@google.com; linux-arm-
kernel@lists.infradead.org; linux-kbuild@vger.kernel.org; linux-
kernel@vger.kernel.org; devicetree@vger.kernel.org; kvm@vger.kernel.org;
okaya@kernel.org; Anand, Harpreet [off-list ref]; Agarwal,
Nikhil [off-list ref]; Simek, Michal [off-list ref];
git (AMD-Xilinx) [off-list ref]
Subject: Re: [RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its
domain as parent

[CAUTION: External Email]
 
OK, so you definitely need a mapping, but it cannot be a translation, and it
needs to be in all the possible address spaces. OMG.
Could you elaborate why it needs to be in all the possible address
spaces? I'm in no way familiar with kernel IOVA allocation, so not
sure I understand this requirement. Note that each CDX device will
have its own unique StreamID (in general case, equal to DeviceID
sent to the GIC), so, from a SMMU perspective, the mapping can be
specific to that device. As long as that IOVA is not allocated to
any DMA region for _that_ device, things should be OK? But, I
appreciate it might not be that simple from a kernel perspective.
Robin answered that one. This also has direct impacts on vfio and
virtualisation, as your userspace/VM cannot have any memory at the
address of the ITS (because you cannot DMA to it).
quoted
quoted
quoted
quoted
As for the data part (EventID in GIC parlance), this is always
going to be the CDX device-relative vector number - I believe this
can't be changed, it is a hardware limitation (but I need to double-
check).
quoted
quoted
quoted
That should be OK, though, as I believe this is exactly what Linux
would write anyway, as each CDX device should be in its own IRQ
domain (i.e. have its own ITS device table).
But that's really the worse part. You have hardcoded what is the
*current* Linux behaviour. Things change. And baking SW behaviour
into a piece of HW looks incredibly shortsighted...
For posterity, I'm not an RTL designer/architect, so share your
sentiment to a certain extent. That said, I expect the decision was
not based on Linux or any other SW behaviour, but because it is the
most straightforward and least expensive way to do it. Having an
EventID register for each and every MSI source just so you can program
it in any random order costs flops and all the associated complexity
of extra RTL logic (think timing closure, etc.), so trade-offs are
made. The fact that it matches current Linux behaviour means we just
got lucky...
Yeah, but that's not the only problem: there is no guarantee that we have
enough LPIs to allocate for the device, so we'll perform a partial allocation (8
instead of 32 LPIs, for example).
Why should that be a problem? The driver will know in advance the
number of vectors required by the device. I expect it will need to
call some equivalent of platform_msi_domain_alloc_irqs(), shouldn't
that fail if not enough IRQs are allocated (and ultimately fail the
probe)?
No, that's not how MSIs work in Linux: this is a best effort
allocation, where the allocator is free to dish out the number of MSIs
it can allocate at this point in time (the ITS will divide the
allocation by two until it succeeds). If the end-point driver doesn't
like it, it can of course decide to fail its own probe.
Even if not, we can still inform the firmware in write_msg,
which will serve as an indication that a particular vector is
enabled. The firmware can decide what to do with the device if not
all of the vectors are enabled.
Do you anticipate that the FW will be involved at the point where the
driver is finalising the device configuration? It seems that you need
to come up with some sort of spec for this if there isn't one yet.

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help