Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

From: Alexander Graf <hidden>
Date: 2013-07-09 21:44:31
Also in: kvm

On 09.07.2013, at 20:46, Scott Wood wrote:

On 07/09/2013 12:44:32 PM, Alexander Graf wrote:

quoted

On 07/09/2013 07:13 PM, Scott Wood wrote:

quoted

On 07/08/2013 08:39:05 AM, Alexander Graf wrote:

quoted

On 28.06.2013, at 11:20, Mihai Caraman wrote:

quoted

lwepx faults need to be handled by KVM, and this implies additional code
in the DO_KVM macro to identify that the source of the exception
originated from host context. This requires checking the Exception
Syndrome Register (ESR[EPID]) and External PID Load Context Register
(EPLC[EGS]) for DTB_MISS, DSI and LRAT exceptions, which is too intrusive
for the host.

Get rid of lwepx and acquire the last instruction in
kvmppc_handle_exit() by searching for the physical address and kmap it.
This fixes an infinite loop

quoted

What's the difference in speed for this?
Also, could we call lwepx later in host code, when
kvmppc_get_last_inst() gets invoked?

quoted

Any use of lwepx is problematic unless we want to add overhead to
the main Linux TLB miss handler.

quoted

What exactly would be missing?

If lwepx faults, it goes to the normal host TLB miss handler.  Without
adding code to it to recognize that it's an external-PID fault, it will
try to search the normal Linux page tables and insert a normal host
entry.  If it thinks it has succeeded, it will retry the instruction
rather than search for an exception handler.  The instruction will fault
again, and you get a hang.
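
The retry loop described above can be illustrated with a toy model. This
is only a sketch under loud assumptions: nothing here is kernel code, and
every name (toy_tlb, naive_miss_handler, run_lwepx) is hypothetical. The
handler ignores which context faulted, fills in a host translation,
reports success, and the external-PID access faults again. A fault cap
stands in for the hang so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Which address-space context the faulting access was made in. */
enum ctx { CTX_HOST, CTX_GUEST_EPID };

/* Toy TLB state: one flag per context instead of real entries. */
struct toy_tlb {
    bool host_mapped;
    bool guest_mapped;
};

/*
 * A miss handler that does not know about external-PID accesses.
 * It always installs a host translation and asks for a retry.
 */
static bool naive_miss_handler(struct toy_tlb *tlb, enum ctx faulting_ctx)
{
    (void)faulting_ctx;      /* never inspected: this is the problem */
    tlb->host_mapped = true;
    return true;             /* "handled, retry the instruction" */
}

/*
 * Model an external-PID load that needs a guest translation.
 * Returns the number of faults taken, capped at max_faults.
 */
static int run_lwepx(struct toy_tlb *tlb, int max_faults)
{
    int faults = 0;

    assert(max_faults > 0);
    while (!tlb->guest_mapped && faults < max_faults) {
        faults++;
        if (!naive_miss_handler(tlb, CTX_GUEST_EPID))
            break;           /* would go to an exception handler instead */
    }
    return faults;
}
```

In this model the fault count always reaches the cap, because the
handler never creates the translation the access actually needs.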

:(

So we either have to rewrite IVOR / IVPR or add a branch in the hot TLB
miss interrupt handler. Both alternatives suck.


quoted

I'd also still like to see some performance benchmarks on this to
make sure we're not walking into a bad direction.

I doubt it'll be significantly different.  There's overhead involved
in setting up for lwepx as well.  It doesn't hurt to test, though this
is a functional correctness issue, so I'm not sure what better
alternatives we have.  I don't want to slow down non-KVM TLB misses for
this.

Yeah, I concur on that part. It probably won't get better. Sigh.


quoted

+    addr = (mas7_mas3 & (~0ULL << psize_shift)) |
+           (geaddr & ((1ULL << psize_shift) - 1ULL));
+
+    /* Map a page and get guest's instruction */
+    page = pfn_to_page(addr >> PAGE_SHIFT);
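
For readers following the arithmetic in the hunk above, here is a
standalone sketch of the address composition. It assumes psize_shift is
log2 of the page size and that mas7_mas3 holds the real page number in
its upper bits. The function name compose_paddr is hypothetical; only
the variable names come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Take the page frame bits from the TLB entry (mas7_mas3) and the
 * in-page offset from the guest effective address (geaddr).
 * Any low bits of mas7_mas3 below the page size are discarded.
 */
static uint64_t compose_paddr(uint64_t mas7_mas3, uint64_t geaddr,
                              unsigned int psize_shift)
{
    uint64_t page_mask;

    assert(psize_shift < 64);   /* a shift by 64 or more is undefined in C */

    page_mask = ~0ULL << psize_shift;

    /* ~page_mask equals (1ULL << psize_shift) - 1, the offset mask. */
    return (mas7_mas3 & page_mask) | (geaddr & ~page_mask);
}
```

With a 4 KiB page (psize_shift = 12), mas7_mas3 = 0x123456003F and
geaddr = 0xC0000ABC combine to 0x1234560ABC: the low 0x03F of the entry
is dropped and the offset 0xABC is taken from the effective address.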

So it seems to me like you're jumping through a lot of hoops to
make sure this works for LRAT and non-LRAT at the same time. Can't we
just treat them as the different things they are?

quoted

What if we have different MMU backends for LRAT and non-LRAT? The
non-LRAT case could then try lwepx, if that fails, fall back to read the
shadow TLB. For the LRAT case, we'd do lwepx, if that fails fall back to
this logic.

quoted

This isn't about LRAT; it's about hardware threads.  It also fixes
the handling of execute-only pages on current chips.

quoted

On non-LRAT systems we could always check our shadow copy of the
guest's TLB, no? I'd really like to know what the performance difference
would be for the 2 approaches.

I suspect that tlbsx is faster, or at worst similar.  And unlike
comparing tlbsx to lwepx (not counting a fix for the threading problem),
we don't already have code to search the guest TLB, so testing would be
more work.

We have code to walk the guest TLB for TLB misses. This really is just
the TLB miss search without host TLB injection.

So let's say we're using the shadow TLB. The guest always has its, say,
64 TLB entries that it can count on - we never evict anything by
accident, because we store all of the 64 entries in our guest TLB cache.
When the guest faults at an address, the first thing we do is check the
cache to see whether we have that page already mapped.

However, with this method we now have 2 enumeration methods for guest
TLB searches. We have the tlbsx one which searches the host TLB and we
have our guest TLB cache. The guest TLB cache might still contain an
entry for an address that we already invalidated on the host. Would that
pose a problem?

I guess not because we're swizzling the exit code around to instead be
an instruction miss which means we restore the TLB entry into our host's
TLB so that when we resume, we land here and the tlbsx hits. But it
feels backwards.

At least this code has to become something more generic, such as
kvmppc_read_guest(vcpu, addr, TYPE_INSN) and move into the host mmu
implementation, as it's 100% host mmu specific.
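
To make the shape of that suggestion concrete, here is a rough sketch of
a generic read entry point that dispatches to a host-MMU-specific
backend. Everything in it is an assumption for illustration: the
host_mmu_ops table, the stand-in struct vcpu, TYPE_DATA, and the toy
"flat" backend are all hypothetical, and sketch_read_guest is not an
existing kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum guest_access_type { TYPE_INSN, TYPE_DATA };  /* TYPE_DATA is assumed */

struct vcpu;   /* stand-in for the real vcpu structure */

/* Per-host-MMU backend: returns 0 and fills *val, or nonzero on failure. */
struct host_mmu_ops {
    int (*read_guest)(struct vcpu *vcpu, uint64_t addr,
                      enum guest_access_type type, uint32_t *val);
};

struct vcpu {
    const struct host_mmu_ops *mmu;
    void *mmu_priv;            /* backend-private state */
};

/* Generic entry point: callers never see how the host MMU resolves addr. */
static int sketch_read_guest(struct vcpu *vcpu, uint64_t addr,
                             enum guest_access_type type, uint32_t *val)
{
    assert(vcpu && vcpu->mmu && vcpu->mmu->read_guest && val);
    return vcpu->mmu->read_guest(vcpu, addr, type, val);
}

/* Toy backend: guest memory is a flat array of words starting at base. */
struct flat_backend {
    const uint32_t *words;
    uint64_t base;
    size_t nwords;
};

static int flat_read_guest(struct vcpu *vcpu, uint64_t addr,
                           enum guest_access_type type, uint32_t *val)
{
    const struct flat_backend *b = vcpu->mmu_priv;
    uint64_t off;

    (void)type;                /* the toy backend treats all accesses alike */
    if (addr < b->base)
        return -1;
    off = addr - b->base;
    if ((off & 3) || off / 4 >= b->nwords)
        return -1;             /* unaligned or out of range */
    *val = b->words[off / 4];
    return 0;
}

static const struct host_mmu_ops flat_mmu_ops = {
    .read_guest = flat_read_guest,
};
```

A real backend would replace flat_read_guest with whatever lookup the
host MMU needs (tlbsx plus kmap, a shadow TLB walk, and so on), while
the emulation code keeps calling the one generic function.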


Alex
