Thread (3 messages) 3 messages, 3 authors, 2016-09-01

[RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions

From: Punit Agrawal <hidden>
Date: 2016-08-26 09:40:07
Also in: kvm, kvmarm, lkml

Punit Agrawal [off-list ref] writes:
Will Deacon [off-list ref] writes:
quoted
Hi Punit,

On Tue, Aug 16, 2016 at 11:45:11AM +0100, Punit Agrawal wrote:
quoted
The ARMv8 architecture allows trapping of TLB maintenane instructions
from EL0/EL1 to higher exception levels. On encountering a trappable TLB
instruction in a guest, an exception is taken to EL2.

Add functionality to handle emulating the TLB instructions.

Signed-off-by: Punit Agrawal <redacted>
Cc: Christoffer Dall <redacted>
Cc: Marc Zyngier <redacted>
[...]
quoted
+void __hyp_text
+__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval)
+{
+	kvm = kern_hyp_va(kvm);
+
+	/*
+	 * Switch to the guest before performing any TLB operations to
+	 * target the appropriate VMID
+	 */
+	__switch_to_guest_regime(kvm);
+
+	/*
+	 *  TLB maintenance operations broadcast to inner-shareable
+	 *  domain when HCR_FB is set (default for KVM).
+	 */
+	switch (sys_op) {
+	case TLBIALL:
+	case TLBIALLIS:
+	case ITLBIALL:
+	case DTLBIALL:
+	case TLBI_VMALLE1:
+	case TLBI_VMALLE1IS:
+		__tlbi(vmalle1is);
+		break;
+	case TLBIMVA:
+	case TLBIMVAIS:
+	case ITLBIMVA:
+	case DTLBIMVA:
+	case TLBI_VAE1:
+	case TLBI_VAE1IS:
+		__tlbi(vae1is, regval);
I'm pretty nervous about this. Although you've switched in the guest stage-2
page table before the TLB maintenance, we're still running on a host stage-1
and it's not clear to me that the stage-1 context is completely ignored for
the purposes of a stage-1 TLBI executed at EL2.

For example, if TCR_EL1.TBI0 is set in the guest but cleared in the host,
my reading of the architecture is that it will be treated as zero when
we perform this invalidation operation. I worry that we have similar
problems with the granule size, where bits become RES0 in the TLBI VA
ops.
Some control bits seem to be explicitly called out to not affect TLB
maintenance operations[0] but I hadn't considered the ones you highlight.

[0] ARMv8 ARM DDI 0487A.j D4.7, Pg D4-1814
quoted
Finally, we should probably be masking out the RES0 bits in the TLBI
ops, just in case some future extension to the architecture defines them
in such a way where they have different meanings when executed at EL2
or EL1.
Although, the RES0 bits for TLBI VA ops are currently ignored, I agree
that masking them out based on granule size protects against future
incompatible changes.
quoted
The easiest thing to do is just TLBI VMALLE1IS for all trapped operations,
but you might want to see how that performs.
That sounds reasonable for correctness. But I suspect we'll have to do
more to claw back some performance. Let me run a few tests and come back
on this.
Assuming I've correctly switched in TCR and replacing the various TLB
operations in this patch with TLBI VMALLE1IS, there is a drop in kernel
build times of ~5% (384s vs 363s).

For the next version, I'll use this as a starting point and try clawing
back the loss by using the appropriate TLB instructions albeit with
additional sanity checking based on context.
Thanks for having a look.

Punit
quoted
Will
_______________________________________________
kvmarm mailing list
kvmarm at lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm at lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help