Re: [RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its domain as parent
From: Marc Zyngier <maz@kernel.org>
Date: 2022-09-08 14:20:07
On Thu, 08 Sep 2022 10:51:01 +0100, "Radovanovic, Aleksandar" [off-list ref] wrote:
[AMD Official Use Only - General]quoted
-----Original Message----- From: Marc Zyngier <maz@kernel.org> Sent: 08 September 2022 09:08 To: Radovanovic, Aleksandar <redacted> Cc: Jason Gunthorpe <jgg@nvidia.com>; Gupta, Nipun [off-list ref]; robh+dt@kernel.org; krzysztof.kozlowski+dt@linaro.org; gregkh@linuxfoundation.org; rafael@kernel.org; eric.auger@redhat.com; alex.williamson@redhat.com; cohuck@redhat.com; Gupta, Puneet (DCG-ENG) [off-list ref]; song.bao.hua@hisilicon.com; mchehab+huawei@kernel.org; f.fainelli@gmail.com; jeffrey.l.hugo@gmail.com; saravanak@google.com; Michael.Srba@seznam.cz; mani@kernel.org; yishaih@nvidia.com; robin.murphy@arm.com; will@kernel.org; joro@8bytes.org; masahiroy@kernel.org; ndesaulniers@google.com; linux-arm- kernel@lists.infradead.org; linux-kbuild@vger.kernel.org; linux- kernel@vger.kernel.org; devicetree@vger.kernel.org; kvm@vger.kernel.org; okaya@kernel.org; Anand, Harpreet [off-list ref]; Agarwal, Nikhil [off-list ref]; Simek, Michal [off-list ref]; git (AMD-Xilinx) [off-list ref] Subject: Re: [RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its domain as parent [CAUTION: External Email] OK, so you definitely need a mapping, but it cannot be a translation, and it needs to be in all the possible address spaces. OMG.Could you elaborate why it needs to be in all the possible address spaces? I'm in no way familiar with kernel IOVA allocation, so not sure I understand this requirement. Note that each CDX device will have its own unique StreamID (in general case, equal to DeviceID sent to the GIC), so, from a SMMU perspective, the mapping can be specific to that device. As long as that IOVA is not allocated to any DMA region for _that_ device, things should be OK? But, I appreciate it might not be that simple from a kernel perspective.
Robin answered that one. This also has direct impacts on vfio and virtualisation, as your userspace/VM cannot have any memory at the address of the ITS (because you cannot DMA to it).
quoted
quoted
quoted
quoted
As for the data part (EventID in GIC parlance), this is always going to be the CDX device-relative vector number - I believe this can't be changed, it is a hardware limitation (but I need to double-check).quoted
quoted
quoted
That should be OK, though, as I believe this is exactly what Linux would write anyway, as each CDX device should be in its own IRQ domain (i.e. have its own ITS device table).But that's really the worse part. You have hardcoded what is the *current* Linux behaviour. Things change. And baking SW behaviour into a piece of HW looks incredibly shortsighted...For posterity, I'm not an RTL designer/architect, so share your sentiment to a certain extent. That said, I expect the decision was not based on Linux or any other SW behaviour, but because it is the most straightforward and least expensive way to do it. Having an EventID register for each and every MSI source just so you can program it in any random order costs flops and all the associated complexity of extra RTL logic (think timing closure, etc.), so trade-offs are made. The fact that it matches current Linux behaviour means we just got lucky...Yeah, but that's not the only problem: there is no guarantee that we have enough LPIs to allocate for the device, so we'll perform a partial allocation (8 instead of 32 LPIs, for example).Why should that be a problem? The driver will know in advance the number of vectors required by the device. I expect it will need to call some equivalent of platform_msi_domain_alloc_irqs(), shouldn't that fail if not enough IRQs are allocated (and ultimately fail the probe)?
No, that's not how MSIs work in Linux: this is a best effort allocation, where the allocator is free to dish out the number of MSIs it can allocate at this point in time (the ITS will divide the allocation by two until it succeeds). If the end-point driver doesn't like it, it can of course decide to fail its own probe.
Even if not, we can still inform the firmware in write_msg, which will serve as an indication that a particular vector is enabled. The firmware can decide what to do with the device if not all of the vectors are enabled.
Do you anticipate that the FW will be involved at the point where the driver is finalising the device configuration? It seems that you need to come up with some sort of spec for this if there isn't one yet. M. -- Without deviation from the norm, progress is not possible. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel