[PATCH v4 15/19] arm/arm64: KVM: add virtual GICv3 distributor emulation
From: andre.przywara@arm.com (Andre Przywara)
Date: 2014-12-03 10:47:34
On 03/12/14 10:30, Christoffer Dall wrote:
> On Tue, Dec 02, 2014 at 05:32:45PM +0000, Andre Przywara wrote:
>> On 02/12/14 17:06, Marc Zyngier wrote:
>>> On 02/12/14 16:24, Andre Przywara wrote:
>>>> Hej Christoffer,
>>>>
>>>> On 30/11/14 08:30, Christoffer Dall wrote:
>>>>> On Fri, Nov 28, 2014 at 03:24:11PM +0000, Andre Przywara wrote:
>>>>>> Hej Christoffer,
>>>>>>
>>>>>> On 25/11/14 10:41, Christoffer Dall wrote:
>>>>>>> Hi Andre,
>>>>>>>
>>>>>>> On Mon, Nov 24, 2014 at 04:00:46PM +0000, Andre Przywara wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>> +
>>>>>>>> +	if (!is_in_range(mmio->phys_addr, mmio->len, rdbase,
>>>>>>>> +			 GIC_V3_REDIST_SIZE * nrcpus))
>>>>>>>> +		return false;
>>>>>>>
>>>>>>> Did you think more about the contiguous allocation issue here or
>>>>>>> can you give me a pointer to the requirement in the spec?
>>>>>>
>>>>>> 5.4.1 Re-Distributor Addressing
>>>>>
>>>>> Section 5.4.1 talks about the pages within a single re-distributor
>>>>> having to be contiguous, not all the re-distributor regions having
>>>>> to be contiguous, right?
>>>>
>>>> Ah yes, you are right. But I still think it does not matter:
>>>> 1) We are "implementing" the GICv3. So as the spec does not forbid
>>>> this, we just state that the redistributor register maps for each
>>>> VCPU are contiguous. Also we create the FDT accordingly. I will add
>>>> a comment in the documentation to state this.
>>>> 2) The kernel's GICv3 DT bindings assume this allocation is the
>>>> default. Although Marc added bindings to work around this (stride),
>>>> it seems much more logical to me to not use it.
>>>>
>>>>> I don't disagree (and never have) with the fact that it is up to us
>>>>> to decide. My original question, which we haven't talked about yet,
>>>>> is if it is *reasonable* to assume that all re-distributor regions
>>>>> will always be contiguous?
>>>>>
>>>>> How will you handle VCPU hotplug for example?
>>>>
>>>> As kvmtool does not support hotplug, I haven't thought about this
>>>> yet. To me it looks like userland should just use maxcpus for the
>>>> allocation. If I get the current QEMU code right, there is room for
>>>> 127 GICv3 VCPUs (2*64K per VCPU + 64K for the distributor in 16M
>>>> space) at the moment. Kvmtool uses a different mapping, which allows
>>>> sharing 1G with virtio, so the limit is around 8000ish VCPUs here.
>>>> Are there any issues with changing the QEMU virt mapping later?
>>>> Migration, maybe?
>>>> If the UART, the RTC and the virtio regions are moved more towards
>>>> the beginning of the 256MB PCI mapping, then there should be space
>>>> for a bit less than 1024 VCPUs, if I get this right.
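
For reference, the capacity figures quoted above can be reproduced with a little arithmetic. The following is only a sketch under the assumptions stated in the thread (two 64K frames per VCPU redistributor, one 64K frame for the distributor, everything packed into one window). The helper name max_gicv3_vcpus() and the GIC_V3_DIST_SIZE macro are made up for illustration and are not part of the patch; GIC_V3_REDIST_SIZE is the name used in the patch, with its value assumed here.

```c
#include <stdint.h>

/* Sizes as stated in the thread: 2 * 64K per VCPU redistributor,
 * 64K for the distributor. The values are assumptions taken from
 * that description, not copied from the patch. */
#define SZ_64K             UINT64_C(0x10000)
#define GIC_V3_DIST_SIZE   SZ_64K          /* hypothetical name */
#define GIC_V3_REDIST_SIZE (2 * SZ_64K)

/* Hypothetical helper: number of VCPUs whose redistributors fit into a
 * window of window_size bytes that also holds the distributor. */
static uint64_t max_gicv3_vcpus(uint64_t window_size)
{
	if (window_size < GIC_V3_DIST_SIZE)
		return 0;
	return (window_size - GIC_V3_DIST_SIZE) / GIC_V3_REDIST_SIZE;
}
```

With a 16M window this gives 127, matching the QEMU figure above; with a 1G window it gives 8191, in line with the "8000ish" estimate for kvmtool if the whole window were available.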
>>>>
>>>>> Where in the guest physical memory map of our various virt machines
>>>>> should these regions sit so that we can allocate enough
>>>>> re-distributors for VCPUs etc.?
>>>>
>>>> Various? Are there other mappings than those described in
>>>> hw/arm/virt.c?
>>>>
>>>>> I just want to make sure we're not limiting ourselves by some amount
>>>>> of functionality or ABI (redistributor base addresses) that will be
>>>>> hard to expand in the future.
>>>>
>>>> If we are flexible with the mapping at VM creation time, QEMU could
>>>> just use a mapping depending on max_cpus:
>>>> < 128 VCPUs: use the current mapping
>>>> 128 <= x < 1020: use a more compressed mapping
>>>> >= 1020: map the redistributor somewhere above 4 GB
>>>>
>>>> As the device tree binding for GICv3 just supports a stride value,
>>>> we don't have any other real options besides this, right? So how I
>>>> see this, a contiguous mapping (with possible holes) is the only way.
>>>
>>> Not really. The GICv3 binding definitely supports having several
>>> regions for the redistributors (see the binding documentation). This
>>> allows for the pathological case where you have N regions for N CPUs.
>>> Not that we ever want to go there, really.
>>
>> Ah yes, thanks for pointing that out. I was mixing this up with the
>> stride parameter, which is independent of this. Sorry for that.
>>
>> So from a userland point of view we probably would like to have the
>> first n VCPU's redistributors mapped at their current places and allow
>> for more VCPUs to use memory above 4 GB.
>> Which would require quite some changes to the code to support this in
>> a very flexible way.
>>
>> I think this could be much easier if we confine ourselves to two
>> regions (one contiguous lower (< 4 GB) and one contiguous upper region
>> (> 4 GB)), so we don't need to support arbitrary per VCPU addresses,
>> but could just use the 1st or 2nd map depending on the VCPU number.
>> Is this too hackish?
>>
>> If not, I would add another vgic_addr type (like
>> KVM_VGIC_V3_ADDR_TYPE_REDIST_UPPER or so) to be used from userland and
>> use that in the handle_mmio region detection.
>>
>> Let me know if that sounds reasonable.
>
> The point that I've been trying to make sure we think about is if we'll
> regret not being able to fragment the redistributor regions a bit. Even
> if it's technically possible, we may regret requiring a huge contiguous
> allocation in the guest physical address space. But maybe we don't care
> when we have 40 bits to play with?

40 bits are more than enough. But are we OK with using only memory above
4GB? Is there some code before the Linux kernel that is limited to 4GB?
I am thinking about 32-bit guests in particular, which may have some
firmware blob executed beforehand which may not use the MMU.

If this is not an issue, I'd rather stay with one contiguous region - at
least for the time being. The current GICv3 code has a limit of 255
VCPUs anyway, so this requires at most 32MB, which should easily fit
anywhere.

Should we later need to extend the number of VCPUs, we can in the worst
case adjust the code to support split regions if the 4GB limit issue
persists. This would be done via a new KVM capability and some new
register groups in the KVM device ioctl to set a second (or following)
region, so in a backwards-compatible way.

Cheers,
Andre.
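
To make the split-region fallback a bit more concrete, here is a rough sketch of how an MMIO address could be mapped to a VCPU index when there is one lower and one upper redistributor window. This is not code from the series: struct redist_layout and redist_addr_to_vcpu() are hypothetical names, the GIC_V3_REDIST_SIZE value is assumed from the "2*64K per VCPU" figure in the thread, and is_in_range() is re-implemented here as a standalone helper that only mirrors the call seen in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed per-VCPU redistributor size (2 * 64K, as stated in the thread). */
#define GIC_V3_REDIST_SIZE UINT64_C(0x20000)

/* Hypothetical description of the two windows; not an existing KVM type. */
struct redist_layout {
	uint64_t lower_base;	/* window below 4 GB */
	uint32_t lower_count;	/* VCPUs served by the lower window */
	uint64_t upper_base;	/* window above 4 GB */
	uint32_t upper_count;	/* VCPUs served by the upper window */
};

/* True if [addr, addr + len) lies entirely within [base, base + size).
 * Written so that none of the comparisons can overflow. */
static bool is_in_range(uint64_t addr, uint64_t len,
			uint64_t base, uint64_t size)
{
	return addr >= base && len <= size && addr - base <= size - len;
}

/* Returns the VCPU index owning the accessed redistributor frame,
 * or -1 if the access hits neither window. */
static int redist_addr_to_vcpu(const struct redist_layout *l,
			       uint64_t addr, uint64_t len)
{
	if (is_in_range(addr, len, l->lower_base,
			GIC_V3_REDIST_SIZE * l->lower_count))
		return (int)((addr - l->lower_base) / GIC_V3_REDIST_SIZE);

	if (is_in_range(addr, len, l->upper_base,
			GIC_V3_REDIST_SIZE * l->upper_count))
		return (int)(l->lower_count +
			     (addr - l->upper_base) / GIC_V3_REDIST_SIZE);

	return -1;
}
```

With a single contiguous region this degenerates to the one is_in_range() check in the patch (upper_count == 0), which is why staying with one region for now does not rule out adding a second one later.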