Thread (18 messages) 18 messages, 3 authors, 2013-07-11

Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

From: Scott Wood <hidden>
Date: 2013-07-09 17:13:51
Also in: kvm

On 07/08/2013 08:39:05 AM, Alexander Graf wrote:
=20
On 28.06.2013, at 11:20, Mihai Caraman wrote:
=20
quoted
lwepx faults needs to be handled by KVM and this implies additional =20
code
quoted
in DO_KVM macro to identify the source of the exception originated =20
from
quoted
host context. This requires to check the Exception Syndrome Register
(ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) for =20
DTB_MISS,
quoted
DSI and LRAT exceptions which is too intrusive for the host.

Get rid of lwepx and acquire last instuction in =20
kvmppc_handle_exit() by
quoted
searching for the physical address and kmap it. This fixes an =20
infinite loop
=20
What's the difference in speed for this?
=20
Also, could we call lwepx later in host code, when =20
kvmppc_get_last_inst() gets invoked?
Any use of lwepx is problematic unless we want to add overhead to the =20
main Linux TLB miss handler.
quoted
+		return;
+	}
+
+	mas3 =3D mfspr(SPRN_MAS3);
+	pr =3D vcpu->arch.shared->msr & MSR_PR;
+	if ((pr && (!(mas3 & MAS3_UX))) || ((!pr) && (!(mas3 & =20
MAS3_SX)))) {
quoted
+	 	/*
+		 * Another thread may rewrite the TLB entry in =20
parallel, don't
quoted
+		 * execute from the address if the execute permission =20
is not set
=20
Isn't this racy?
Yes, that's the point.  We want to access permissions atomically with =20
the address.  If the guest races here, the unpredictable behavior is =20
its own fault, but we don't want to make it worse by assuming that the =20
new TLB entry is executable just because the old TLB entry was.

There's still a potential problem if the instruction at the new TLB =20
entry is valid but not something that KVM emulates (because it wouldn't =20
have trapped).  Given that the guest is already engaging in =20
unpredictable behavior, though, and that it's no longer a security =20
issue (it'll just cause the guest to exit), I don't think we need to =20
worry too much about it.
quoted
+		 */
+		vcpu->arch.fault_esr =3D 0;
+		*exit_nr =3D BOOKE_INTERRUPT_INST_STORAGE;
+		return;
+	}
+
+	/* Get page size */
+	if (MAS0_GET_TLBSEL(mfspr(SPRN_MAS0)) =3D=3D 0)
+		psize_shift =3D PAGE_SHIFT;
+	else
+		psize_shift =3D MAS1_GET_TSIZE(mas1) + 10;
+
+	mas7_mas3 =3D (((u64) mfspr(SPRN_MAS7)) << 32) |
+		    mfspr(SPRN_MAS3);
=20
You're non-atomically reading MAS3/MAS7 after you've checked for =20
permissions on MAS3. I'm surprised there's no handler that allows =20
MAS3/7 access through the new, combined SPR for 64bit systems.
There is, but then we'd need to special-case 64-bit systems.  Why does =20
atomicity matter here?  The MAS registers were filled in when we did =20
the tlbsx.  They are thread-local.  They don't magically change just =20
because the other thread rewrites the TLB entry that was used to fill =20
them.
quoted
+	addr =3D (mas7_mas3 & (~0ULL << psize_shift)) |
+	       (geaddr & ((1ULL << psize_shift) - 1ULL));
+
+	/* Map a page and get guest's instruction */
+	page =3D pfn_to_page(addr >> PAGE_SHIFT);
=20
So it seems to me like you're jumping through a lot of hoops to make =20
sure this works for LRAT and non-LRAT at the same time. Can't we just =20
treat them as the different things they are?
=20
What if we have different MMU backends for LRAT and non-LRAT? The =20
non-LRAT case could then try lwepx, if that fails, fall back to read =20
the shadow TLB. For the LRAT case, we'd do lwepx, if that fails fall =20
back to this logic.
This isn't about LRAT; it's about hardware threads.  It also fixes the =20
handling of execute-only pages on current chips.

-Scott=
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help