Thread (135 messages), 5 authors, 2019-02-14

Re: [PATCH 00/19] KVM: PPC: Book3S HV: add XIVE native exploitation mode

From: Cédric Le Goater <clg@kaod.org>
Date: 2019-01-29 14:29:55
Also in: kvm

Another general comment is that you seem to have written all this
code assuming we are using HV KVM in a host running bare-metal.
Yes. I didn't look at the other configurations. I thought that we could
use the kernel_irqchip=off option to begin with. A couple of checks
are indeed missing.
Using kernel_irqchip=off would mean that we would not be able to use
the in-kernel XICS emulation, which would have a performance impact.
Yes, but it is not supported today. Correct?
We need an explicit capability for XIVE exploitation that can be
enabled or disabled on the qemu command line, so that we can enforce a
uniform set of capabilities across all the hosts in a migration
domain.  And it's no good to say we have the capability when all
attempts to use it will fail.  Therefore the kernel needs to say that
it doesn't have the capability in a PR KVM guest or in a nested HV
guest.
OK. I will work on adding a KVM_CAP_PPC_NESTED_IRQ_HV capability 
for future use.
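
To make the rule above concrete, here is a minimal sketch of the decision being asked for: report the XIVE native capability only where attempts to use it can succeed. It is not the kernel implementation. The enum, its values, and the function name are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the configurations named in this thread. */
enum kvm_host_mode {
    KVM_MODE_HV_BARE_METAL, /* HV KVM in a host running bare-metal */
    KVM_MODE_PR,            /* PR KVM, on bare metal or inside a guest */
    KVM_MODE_NESTED_HV,     /* kvm_hv module running inside a KVM guest */
};

/*
 * Illustrative helper (not a real kernel function): advertise the
 * capability only for bare-metal HV KVM on a XIVE-capable host, so
 * that it is never reported where every attempt to use it would fail.
 */
bool xive_native_cap_available(enum kvm_host_mode mode, bool host_has_xive)
{
    if (!host_has_xive)
        return false;
    return mode == KVM_MODE_HV_BARE_METAL;
}
```

With this shape, a PR KVM guest or a nested HV guest always sees the capability as absent, which is what lets a management layer enforce a uniform set of capabilities across a migration domain.
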
However, we could be using PR KVM (either in a bare-metal host or in a
guest), or we could be doing nested HV KVM where we are using the
kvm_hv module inside a KVM guest and using special hypercalls for
controlling our guests.
Yes. 

It would be good to talk a little about the nested support (maybe
offline) to make sure that we are not missing some major interface that
would require a lot of change. If we need to prepare ground, I think
the timing is good.

The size of the IRQ number space might be a problem. It seems we 
would need to increase it considerably to support multiple nested 
guests. That said, I haven't looked much at how nested is designed.
The current design of nested HV is that the entire non-volatile state
of all the nested guests is encapsulated within the state and
resources of the L1 hypervisor.  That means that if the L1 hypervisor
gets migrated, all of its guests go across inside it and there is no
extra state that L0 needs to be aware of.  That would imply that the
VP number space for the nested guests would need to come from within
the VP number space for L1; but the amount of VP space we allocate to
each guest doesn't seem to be large enough for that to be practical.
If the KVM XIVE device had some information on the max number of CPUs 
provisioned for the guest, we could optimize the VP allocation.
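
As a rough sketch of that optimization, the VP block could be sized from the number of CPUs provisioned for the guest rather than from a static maximum. The function name is hypothetical, and rounding up to a power of two is an assumption made for illustration, not something stated in this thread.

```c
#include <stdint.h>

/*
 * Illustrative only: size a VP block from the guest's provisioned CPU
 * count. Assumes (hypothetically) that VP blocks come in power-of-two
 * sizes. Returns 0 if the request cannot be represented in 32 bits.
 */
uint32_t vp_block_size(uint32_t max_vcpus)
{
    uint32_t size = 1;

    if (max_vcpus > (UINT32_C(1) << 31))
        return 0;

    /* Round up to the next power of two that covers max_vcpus. */
    while (size < max_vcpus)
        size <<= 1;

    return size;
}
```

Under these assumptions, a guest provisioned with 9 CPUs would take a 16-entry block instead of one sized for the static limit, leaving more VP space available for nested guests.
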

That might be a larger KVM topic though. There are some static limits 
on the number of CPUs in QEMU and in KVM, which have no relation AFAICT. 

Thanks,

C.