Thread: 62 messages, 5 authors, 2021-08-01

Re: [PATCH 07/16] KVM: arm64: Wire MMIO guard hypercalls

From: Marc Zyngier <maz@kernel.org>
Date: 2021-08-01 11:21:02
Also in: kvm, kvmarm, lkml

On Fri, 30 Jul 2021 14:11:03 +0100,
Will Deacon [off-list ref] wrote:
On Wed, Jul 28, 2021 at 11:47:20AM +0100, Marc Zyngier wrote:
On Tue, 27 Jul 2021 19:11:46 +0100,
Will Deacon [off-list ref] wrote:
On Thu, Jul 15, 2021 at 05:31:50PM +0100, Marc Zyngier wrote:
Plumb in the hypercall interface to allow a guest to discover,
enroll, map and unmap MMIO regions.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/hypercalls.c | 20 ++++++++++++++++++++
 include/linux/arm-smccc.h   | 28 ++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 30da78f72b3b..a3deeb907fdd 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -5,6 +5,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/kvm_emulate.h>
+#include <asm/kvm_mmu.h>
 
 #include <kvm/arm_hypercalls.h>
 #include <kvm/arm_psci.h>
@@ -129,10 +130,29 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 	case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
 		val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
 		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
+		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_INFO);
+		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_ENROLL);
+		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_MAP);
+		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_UNMAP);
 		break;
 	case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
 		kvm_ptp_get_time(vcpu, val);
 		break;
+	case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_INFO_FUNC_ID:
+		val[0] = PAGE_SIZE;
+		break;
I get the nagging feeling that querying the stage-2 page-size outside of
MMIO guard is going to be useful once we start looking at memory sharing,
so perhaps rename this to something more generic?
At this stage, why not follow the architecture and simply expose it as
ID_AA64MMFR0_EL1.TGran{4,64,16}_2? That's exactly what it is for, and
we already check for this in KVM itself.
Nice, I hadn't thought of that. On reflection, though, I don't agree that
it's "exactly what it is for" -- the ID register talks about the supported
stage-2 page-sizes, whereas we want to advertise the one page size that
we're currently using. In other words, it's important that we only ever
populate one of the fields and I wonder if that could bite us in future
somehow?
Either that, or we expose all the page sizes >= to that of the host
(using the fact that larger page sizes are multiples of the base one),
and use the guest's page size to work out the granularity. Which is
what NV does already.
Up to you, you've definitely got a better feel for this than me.
I'll have a look. The "one size" version is dead easy.
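
For illustration only, the "one size" idea discussed above could look roughly like the sketch below: the hypervisor hands the guest a value laid out like ID_AA64MMFR0_EL1 with exactly one of the TGran{4,64,16}_2 fields marked as supported, and the guest decodes it back into a page size. The field positions and encodings are as I understand them from the architecture; the helper names (advertise_one_granule, stage2_granule) are made up for this example and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Stage-2 granule fields of ID_AA64MMFR0_EL1 (4 bits each). */
#define TGRAN16_2_SHIFT		32
#define TGRAN64_2_SHIFT		36
#define TGRAN4_2_SHIFT		40

/* Encodings: 0 defers to the stage-1 field, 1 is "not supported",
 * 2 and above mean "supported". */
#define TGRAN_2_NONE		0x1ULL
#define TGRAN_2_SUPPORTED	0x2ULL

static unsigned int tgran_field(uint64_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xf;
}

/* Hypothetical host side: advertise only the granule in use. */
static uint64_t advertise_one_granule(unsigned long page_size)
{
	unsigned int shift;
	uint64_t reg = (TGRAN_2_NONE << TGRAN4_2_SHIFT) |
		       (TGRAN_2_NONE << TGRAN64_2_SHIFT) |
		       (TGRAN_2_NONE << TGRAN16_2_SHIFT);

	assert(page_size == 4096 || page_size == 16384 || page_size == 65536);

	if (page_size == 4096)
		shift = TGRAN4_2_SHIFT;
	else if (page_size == 16384)
		shift = TGRAN16_2_SHIFT;
	else
		shift = TGRAN64_2_SHIFT;

	reg &= ~(0xfULL << shift);
	reg |= TGRAN_2_SUPPORTED << shift;
	return reg;
}

/* Hypothetical guest side: return the page size if exactly one field
 * says "supported", 0 otherwise (none, several, or deferred). */
static unsigned long stage2_granule(uint64_t reg)
{
	unsigned long size = 0;
	unsigned int found = 0;

	if (tgran_field(reg, TGRAN4_2_SHIFT) >= TGRAN_2_SUPPORTED) {
		size = 4096;
		found++;
	}
	if (tgran_field(reg, TGRAN16_2_SHIFT) >= TGRAN_2_SUPPORTED) {
		size = 16384;
		found++;
	}
	if (tgran_field(reg, TGRAN64_2_SHIFT) >= TGRAN_2_SUPPORTED) {
		size = 65536;
		found++;
	}
	return found == 1 ? size : 0;
}
```

The guest-side decoder deliberately rejects a value with more than one field populated, which is the case Will is worried about. The alternative (advertising every size >= the host's) would need the guest to pick based on its own page size instead.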
+	case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_ENROLL_FUNC_ID:
+		set_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);
+		val[0] = SMCCC_RET_SUCCESS;
+		break;
+	case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_MAP_FUNC_ID:
+		if (kvm_install_ioguard_page(vcpu, vcpu_get_reg(vcpu, 1)))
+			val[0] = SMCCC_RET_SUCCESS;
+		break;
+	case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_UNMAP_FUNC_ID:
+		if (kvm_remove_ioguard_page(vcpu, vcpu_get_reg(vcpu, 1)))
+			val[0] = SMCCC_RET_SUCCESS;
+		break;
I think there's a slight discrepancy between MAP and UNMAP here in that
calling UNMAP on something that hasn't been mapped will fail, whereas
calling MAP on something that's already been mapped will succeed. I think
that might mean you can't reason about the final state of the page if two
vCPUs race to call these functions in some cases (and both succeed).
I'm not sure that's the expected behaviour for ioremap(), for example
(you can ioremap two portions of the same page successfully).
Hmm, good point. Does that mean we should be refcounting the stage-2?
Otherwise if we do something like:

	foo = ioremap(page, 0x100);
	bar = ioremap(page+0x100, 0x100);
	iounmap(foo);

then bar will break. Or did I miss something in the series?
No, I don't think you have. But I don't think we should implement this
refcounting in the hypervisor side. We really should do it guest side.

I'll have a look.
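
As a rough sketch of what guest-side refcounting might look like (this is not code from the series): the guest keeps a per-page count and only issues the MAP hypercall on the 0 -> 1 transition and the UNMAP hypercall on the 1 -> 0 transition, so two ioremap() calls covering the same page no longer break each other. The hypercall wrappers below are stand-ins that merely count invocations, the fixed-size table and the 4kB PAGE_SHIFT are assumptions, and a real implementation would need locking against concurrent vCPUs.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumed 4kB stage-2 granule */
#define MAX_TRACKED	16	/* arbitrary table size for the sketch */

struct ioguard_ref {
	uint64_t pfn;
	unsigned int count;	/* 0 means the slot is free */
};

static struct ioguard_ref refs[MAX_TRACKED];

/* Stand-ins for the MAP/UNMAP hypercalls; they only count calls. */
static unsigned int map_calls, unmap_calls;

static int hyp_mmio_guard_map(uint64_t ipa)
{
	(void)ipa;
	map_calls++;
	return 0;
}

static int hyp_mmio_guard_unmap(uint64_t ipa)
{
	(void)ipa;
	unmap_calls++;
	return 0;
}

/* Find the slot tracking @pfn; optionally claim a free one. */
static struct ioguard_ref *find_ref(uint64_t pfn, int alloc)
{
	struct ioguard_ref *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_TRACKED; i++) {
		if (refs[i].count && refs[i].pfn == pfn)
			return &refs[i];
		if (!refs[i].count && !free_slot)
			free_slot = &refs[i];
	}
	if (!alloc || !free_slot)
		return NULL;
	free_slot->pfn = pfn;
	return free_slot;
}

/* Called for each page covered by an ioremap(). */
static int guard_get_page(uint64_t addr)
{
	uint64_t pfn = addr >> PAGE_SHIFT;
	struct ioguard_ref *r = find_ref(pfn, 1);

	if (!r)
		return -1;
	/* Only the first user of the page talks to the hypervisor. */
	if (r->count == 0 && hyp_mmio_guard_map(pfn << PAGE_SHIFT))
		return -1;
	r->count++;
	return 0;
}

/* Called for each page covered by an iounmap(). */
static int guard_put_page(uint64_t addr)
{
	uint64_t pfn = addr >> PAGE_SHIFT;
	struct ioguard_ref *r = find_ref(pfn, 0);

	if (!r)
		return -1;	/* unbalanced unmap */
	/* Only the last user of the page talks to the hypervisor. */
	if (--r->count == 0)
		return hyp_mmio_guard_unmap(pfn << PAGE_SHIFT);
	return 0;
}
```

With this in place, the foo/bar sequence above results in a single MAP when foo is created and no UNMAP until bar is torn down as well.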

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel