[RFC 04/11] KVM, arm, arm64: Offer PAs to IPAs idmapping to internal VMs

From: Florent Revest <hidden>
Date: 2017-09-26 21:14:52
Also in: kvm, kvmarm, linux-efi, lkml

On Thu, 2017-08-31 at 11:23 +0200, Christoffer Dall wrote:

quoted

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 2ea21da..1d2d3df 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c

@@ -772,6 +772,11 @@ static void stage2_unmap_memslot(struct kvm

*kvm,
????????phys_addr_t size = PAGE_SIZE * memslot->npages;
????????hva_t reg_end = hva + size;

+???????if (unlikely(!kvm->mm)) {

I think you should consider using a predicate so that it's clear that
this is for in-kernel VMs and not just some random situation where mm
can be NULL.

Internal VMs should be the only usage when kvm->mm would be NULL.
However if you'd prefer it otherwise, I'll make sure this condition
will be made clearer.

So it's unclear to me why we don't need any special casing in
kvm_handle_guest_abort, related to MMIO exits etc.??You probably
assume that we will never do emulation, but that should be described
and addressed somewhere before I can critically review this patch.

This is indeed what I was assuming. This RFC does not allow MMIO with
internal VMs. I can not think of a usage when this would be useful. I'd
make sure this would be documented in an eventual later RFC.

quoted

+static int internal_vm_prep_mem(struct kvm *kvm,
+???????????????????????????????const struct
kvm_userspace_memory_region *mem)
+{
+???????phys_addr_t addr, end;
+???????unsigned long pfn;
+???????int ret;
+???????struct kvm_mmu_memory_cache cache = { 0 };
+
+???????end = mem->guest_phys_addr + mem->memory_size;
+???????pfn = __phys_to_pfn(mem->guest_phys_addr);
+???????addr = mem->guest_phys_addr;

My main concern here is that we don't do any checks on this region
and we could be mapping device memory here as well.??Are we intending
that to be ok, and are we then relying on the guest to use proper
memory attributes ?

Indeed, being able to map device memory is intended. It is needed for
Runtime Services sandboxing. It also relies on the guest being
correctly configured.

quoted

+
+???????for (; addr < end; addr += PAGE_SIZE) {
+???????????????pte_t pte = pfn_pte(pfn, PAGE_S2);
+
+???????????????pte = kvm_s2pte_mkwrite(pte);
+
+???????????????ret = mmu_topup_memory_cache(&cache,
+????????????????????????????????????????????KVM_MMU_CACHE_MIN_PAGE
S,
+????????????????????????????????????????????KVM_NR_MEM_OBJS);

You should be able to allocate all you need up front instead of doing
it in sequences.

Ok.

quoted

+???????????????if (ret) {
+???????????????????????mmu_free_memory_cache(&cache);
+???????????????????????return ret;
+???????????????}
+???????????????spin_lock(&kvm->mmu_lock);
+???????????????ret = stage2_set_pte(kvm, &cache, addr, &pte, 0);
+???????????????spin_unlock(&kvm->mmu_lock);

Since you're likely to allocate some large contiguous chunks here,
can you have a look at using section mappings?

Will do.

Thank you very much,
? ? Florent

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help