Re: [PATCH v3 00/25] Arm GICv5: Host driver implementation
From: Marc Zyngier <maz@kernel.org>
Date: 2025-05-07 09:09:47
Also in:
linux-arm-kernel, lkml
On Wed, 07 May 2025 08:54:36 +0100, Lorenzo Pieralisi [off-list ref] wrote:
On Tue, May 06, 2025 at 03:05:39PM +0100, Marc Zyngier wrote:quoted
On Tue, 06 May 2025 13:23:29 +0100, Lorenzo Pieralisi [off-list ref] wrote:quoted
============= 2.5 GICv5 IWB ============= The IWB driver has been dropped owing to issues encountered with core code DOMAIN_BUS_WIRED_TO_MSI bus token handling: https://lore.kernel.org/lkml/87tt6310hu.wl-maz@kernel.org/ (local)This problem does not have much to do with DOMAIN_BUS_WIRED_TO_MSI. The issues are that: - the core code calls into the .prepare domain on a per-interrupt basis instead of on a per *device* basis. This is a complete violation of the MSI API, because .prepare is when you are supposed to perform resource reservation (in the GICv3 parlance, that's ITT allocation + MAPD command). - the same function calls .prepare for a *single* interrupt, effectively telling the irqchip "my device has only one interrupt". Because I'm super generous (and don't like wasting precious bytes), I allocate 32 LPIs at the minimum. Only snag is that I could do with 300+ interrupts, and calling repeatedly doesn't help at all, since we cannot *grow* an ITT.On the IWB driver code that I could not post I noticed that it is true that the .prepare callback is called on a per-interrupt basis but the vector size is the domain size (ie number of wires) which is correct AFAICS, so the ITT size should be fine I don't get why it would need to grow.
Look again. The only reason you are getting something that *looks* correct is that its_pmsi_prepare() has this nugget: /* Allocate at least 32 MSIs, and always as a power of 2 */ nvec = max_t(int, 32, roundup_pow_of_two(nvec)); and that the IWB is, conveniently, in sets of 32. However, the caller of this function (__msi_domain_alloc_irqs()) passes a nvec value that is always exactly *1* when allocating an interrupt. So you're just lucky that I picked a minimum ITT size that matches the IWB on your model. Configure your IWB to be, let's say, 256 interrupts and use the last one, and you'll have a very different behaviour.
The difference with this series is that on v3 LPIs are allocated on .prepare(), we allocate them on .alloc().
Absolutely not. Even on v3, we never allocate LPIs in .prepare(). We allocate the ITT, perform the MAPD, and that's it. That's why it's called *prepare*.
So yes, calling .prepare on a per-interrupt basis looks like a bug but if we allow reusing a deviceID (ie the "shared" thingy) it could be harmless.
Harmless? No. It is really *bad*. It means you lose any sort of sane tracking of what owns the ITT and how you can free things. Seeing a devid twice is the admission that we have no idea of what is going on. GICv3 is already in that sorry state, but I am hopeful that GICv5 can be a bit less crap.
quoted
So this code needs to be taken to the backyard and beaten into shape before we can make use of it. My D05 (with its collection of MBIGENs) only works by accident at the moment, as I found out yesterday, and GICv5 IWB is in the same boat, since it reuses the msi-parent thing, and therefore the same heuristic. I guess not having the IWB immediately isn't too big a deal, but I really didn't expect to find this...To be honest, it was expected. We found these snags while designing the code (that explains how IWB was structured in v1 - by the way) but we didn't know if the behaviour above was by construction, we always thought "we must be making a mistake".
Then why didn't you report it? We could have caught this very early on, before the fscked-up code was in a stable release... M. -- Without deviation from the norm, progress is not possible.