Re: [PATCH v3 00/25] Arm GICv5: Host driver implementation
From: Lorenzo Pieralisi <lpieralisi@kernel.org>
Date: 2025-05-07 10:01:35
Also in:
linux-arm-kernel, lkml
On Wed, May 07, 2025 at 10:09:44AM +0100, Marc Zyngier wrote:
On Wed, 07 May 2025 08:54:36 +0100, Lorenzo Pieralisi [off-list ref] wrote:quoted
On Tue, May 06, 2025 at 03:05:39PM +0100, Marc Zyngier wrote:quoted
On Tue, 06 May 2025 13:23:29 +0100, Lorenzo Pieralisi [off-list ref] wrote:quoted
============= 2.5 GICv5 IWB ============= The IWB driver has been dropped owing to issues encountered with core code DOMAIN_BUS_WIRED_TO_MSI bus token handling: https://lore.kernel.org/lkml/87tt6310hu.wl-maz@kernel.org/ (local)This problem does not have much to do with DOMAIN_BUS_WIRED_TO_MSI. The issues are that: - the core code calls into the .prepare domain on a per-interrupt basis instead of on a per *device* basis. This is a complete violation of the MSI API, because .prepare is when you are supposed to perform resource reservation (in the GICv3 parlance, that's ITT allocation + MAPD command). - the same function calls .prepare for a *single* interrupt, effectively telling the irqchip "my device has only one interrupt". Because I'm super generous (and don't like wasting precious bytes), I allocate 32 LPIs at the minimum. Only snag is that I could do with 300+ interrupts, and calling repeatedly doesn't help at all, since we cannot *grow* an ITT.On the IWB driver code that I could not post I noticed that it is true that the .prepare callback is called on a per-interrupt basis but the vector size is the domain size (ie number of wires) which is correct AFAICS, so the ITT size should be fine I don't get why it would need to grow.Look again. The only reason you are getting something that *looks* correct is that its_pmsi_prepare() has this nugget: /* Allocate at least 32 MSIs, and always as a power of 2 */ nvec = max_t(int, 32, roundup_pow_of_two(nvec)); and that the IWB is, conveniently, in sets of 32. However, the caller of this function (__msi_domain_alloc_irqs()) passes a nvec value that is always exactly *1* when allocating an interrupt.
nvec is one but this does not work for the reason above, it works because of AFAICS (for the IWB set-up I have): msi_info = msi_get_domain_info(domain); if (msi_info->hwsize > nvec) nvec = msi_info->hwsize;
So you're just lucky that I picked a minimum ITT size that matches the IWB on your model.
Not really, we test with wires above 32, we end up calling .prepare() with the precise number of wires, don't know why that does not work for the MBIgen (possibly because the interrupt-controller platform devices are children of the "main" MBIgen platform device ? The IWB one is created by OF code, MBIgen has to create children, maybe that's what is going wrong with the device/domain hierarchy ?).
Configure your IWB to be, let's say, 256 interrupts and use the last one, and you'll have a very different behaviour.
See above.
quoted
The difference with this series is that on v3 LPIs are allocated on .prepare(), we allocate them on .alloc().Absolutely not. Even on v3, we never allocate LPIs in .prepare(). We allocate the ITT, perform the MAPD, and that's it. That's why it's called *prepare*.
I supposed that's what its_lpi_alloc() does in its_create_device() but OK, won't mention that any further.
quoted
So yes, calling .prepare on a per-interrupt basis looks like a bug but if we allow reusing a deviceID (ie the "shared" thingy) it could be harmless.Harmless? No. It is really *bad*. It means you lose any sort of sane tracking of what owns the ITT and how you can free things. Seeing a devid twice is the admission that we have no idea of what is going on. GICv3 is already in that sorry state, but I am hopeful that GICv5 can be a bit less crap.
Well, GICv5 will have to cope with designs, hopefully deviceIDs sharing is a thing of the past I am not eulogizing the concept :)
quoted
quoted
So this code needs to be taken to the backyard and beaten into shape before we can make use of it. My D05 (with its collection of MBIGENs) only works by accident at the moment, as I found out yesterday, and GICv5 IWB is in the same boat, since it reuses the msi-parent thing, and therefore the same heuristic. I guess not having the IWB immediately isn't too big a deal, but I really didn't expect to find this...To be honest, it was expected. We found these snags while designing the code (that explains how IWB was structured in v1 - by the way) but we didn't know if the behaviour above was by construction, we always thought "we must be making a mistake".Then why didn't you report it? We could have caught this very early on, before the fscked-up code was in a stable release...
We spotted it late March - planned to discuss the IWB design while reviewing v5. Thanks, Lorenzo