Thread: 18 messages, 3 authors, 2013-07-11

Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

From: Alexander Graf <hidden>
Date: 2013-07-09 21:44:31
Also in: kvm

On 09.07.2013, at 20:46, Scott Wood wrote:
> On 07/09/2013 12:44:32 PM, Alexander Graf wrote:
>> On 07/09/2013 07:13 PM, Scott Wood wrote:
>>> On 07/08/2013 08:39:05 AM, Alexander Graf wrote:
>>>> On 28.06.2013, at 11:20, Mihai Caraman wrote:
>>>>> lwepx faults need to be handled by KVM and this implies additional code
>>>>> in DO_KVM macro to identify the source of the exception originated from
>>>>> host context. This requires to check the Exception Syndrome Register
>>>>> (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) for DTB_MISS,
>>>>> DSI and LRAT exceptions which is too intrusive for the host.
>>>>>
>>>>> Get rid of lwepx and acquire last instruction in kvmppc_handle_exit() by
>>>>> searching for the physical address and kmap it. This fixes an infinite loop
>>>> What's the difference in speed for this?
>>>> Also, could we call lwepx later in host code, when
>>>> kvmppc_get_last_inst() gets invoked?
>>> Any use of lwepx is problematic unless we want to add overhead to the
>>> main Linux TLB miss handler.
>> What exactly would be missing?
>
> If lwepx faults, it goes to the normal host TLB miss handler.  Without
> adding code to it to recognize that it's an external-PID fault, it will
> try to search the normal Linux page tables and insert a normal host
> entry.  If it thinks it has succeeded, it will retry the instruction
> rather than search for an exception handler.  The instruction will
> fault again, and you get a hang.

:(

So we either have to rewrite IVOR / IVPR or add a branch in the hot TLB
miss interrupt handler. Both alternatives suck.
>
>> I'd also still like to see some performance benchmarks on this to make
>> sure we're not walking into a bad direction.
>
> I doubt it'll be significantly different.  There's overhead involved in
> setting up for lwepx as well.  It doesn't hurt to test, though this is
> a functional correctness issue, so I'm not sure what better
> alternatives we have.  I don't want to slow down non-KVM TLB misses for
> this.

Yeah, I concur on that part. It probably won't get better. Sigh.
>
>>>>> +    addr = (mas7_mas3 & (~0ULL << psize_shift)) |
>>>>> +           (geaddr & ((1ULL << psize_shift) - 1ULL));
>>>>> +
>>>>> +    /* Map a page and get guest's instruction */
>>>>> +    page = pfn_to_page(addr >> PAGE_SHIFT);
>>>> So it seems to me like you're jumping through a lot of hoops to make
>>>> sure this works for LRAT and non-LRAT at the same time. Can't we just
>>>> treat them as the different things they are?
>>>>
>>>> What if we have different MMU backends for LRAT and non-LRAT? The
>>>> non-LRAT case could then try lwepx, if that fails, fall back to read
>>>> the shadow TLB. For the LRAT case, we'd do lwepx, if that fails fall
>>>> back to this logic.
>>> This isn't about LRAT; it's about hardware threads.  It also fixes the
>>> handling of execute-only pages on current chips.
>> On non-LRAT systems we could always check our shadow copy of the
>> guest's TLB, no? I'd really like to know what the performance
>> difference would be for the 2 approaches.
>
> I suspect that tlbsx is faster, or at worst similar.  And unlike
> comparing tlbsx to lwepx (not counting a fix for the threading
> problem), we don't already have code to search the guest TLB, so
> testing would be more work.
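
For reference, the address computation in the quoted hunk keeps the
page-frame bits of the combined MAS7/MAS3 value and takes the in-page
offset from the guest effective address. A minimal standalone sketch of
that arithmetic follows. The helper name compose_phys_addr() is made up
for illustration, and treating psize_shift as log2 of the page size in
bytes is an assumption, since the code that derives it is not part of
the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the quoted expression:
 *   addr = (mas7_mas3 & (~0ULL << psize_shift)) |
 *          (geaddr & ((1ULL << psize_shift) - 1ULL));
 *
 * Assumption: psize_shift is log2(page size in bytes).
 */
static uint64_t compose_phys_addr(uint64_t mas7_mas3, uint64_t geaddr,
				  unsigned int psize_shift)
{
	assert(psize_shift < 64);	/* a shift by >= 64 is undefined in C */

	/* Low psize_shift bits select the byte within the page. */
	uint64_t offset_mask = (UINT64_C(1) << psize_shift) - 1;

	/*
	 * ~offset_mask equals (~0ULL << psize_shift): it drops whatever
	 * sits below the page boundary in mas7_mas3 and keeps the frame.
	 */
	return (mas7_mas3 & ~offset_mask) | (geaddr & offset_mask);
}
```

With a 4 KiB page (psize_shift = 12), a frame value of 0x123456000 and
a guest address of 0xC0001ABC, this yields 0x123456ABC.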

We have code to walk the guest TLB for TLB misses. This really is just
the TLB miss search without host TLB injection.

So let's say we're using the shadow TLB. The guest always has its, say,
64 TLB entries that it can count on - we never evict anything by
accident, because we store all of the 64 entries in our guest TLB cache.
When the guest faults at an address, the first thing we do is check the
cache to see whether we already have that page mapped.
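
As a rough illustration of that lookup, a search over such a cache
could look like the sketch below. Everything in it is hypothetical: the
struct layout, the names, and the fixed size of 64 are simplifications
for this mail, not the actual KVM data structures, and the match
ignores PID and address-space bits that a real entry would also have to
compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_TLB_ENTRIES 64	/* the "say, 64" entries from the text */

/* Hypothetical, simplified entry; real entries carry more state. */
struct guest_tlb_entry {
	bool valid;
	uint64_t eaddr;			/* effective base address */
	uint64_t raddr;			/* guest-physical base address */
	unsigned int psize_shift;	/* log2(page size in bytes) */
};

struct guest_tlb_cache {
	struct guest_tlb_entry entry[GUEST_TLB_ENTRIES];
};

/* Return the cached entry that covers eaddr, or NULL if none does. */
static const struct guest_tlb_entry *
guest_tlb_lookup(const struct guest_tlb_cache *cache, uint64_t eaddr)
{
	for (size_t i = 0; i < GUEST_TLB_ENTRIES; i++) {
		const struct guest_tlb_entry *e = &cache->entry[i];
		uint64_t page_mask;

		if (!e->valid)
			continue;

		assert(e->psize_shift < 64);
		page_mask = ~((UINT64_C(1) << e->psize_shift) - 1);

		/* Same page if the bits above the page offset agree. */
		if ((eaddr & page_mask) == (e->eaddr & page_mask))
			return e;
	}

	return NULL;	/* not cached: the guest has no mapping here */
}
```

A linear scan is enough for a sketch of this size; nothing here touches
the host TLB, which is the point of the comparison with tlbsx.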

However, with this method we now have 2 enumeration methods for guest
TLB searches. We have the tlbsx one, which searches the host TLB, and we
have our guest TLB cache. The guest TLB cache might still contain an
entry for an address that we already invalidated on the host. Would that
pose a problem?

I guess not, because we're swizzling the exit code around to instead be
an instruction miss, which means we restore the TLB entry into our
host's TLB so that when we resume, we land here and the tlbsx hits. But
it feels backwards.

At least this code has to become something more generic, such as
kvmppc_read_guest(vcpu, addr, TYPE_INSN), and move into the host MMU
implementation, as it's 100% host MMU specific.
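
To make that a bit more concrete, here is a minimal sketch of what such
a split could look like: a generic entry point that only dispatches, and
a per-host-MMU backend that knows how to reach guest memory. All names
are hypothetical (sketch_read_guest, struct host_mmu_ops, the toy flat
backend), TYPE_DATA is an assumed counterpart to TYPE_INSN, and the
extra buffer and length arguments are an addition for illustration. It
is not existing KVM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* TYPE_INSN is from the text above; TYPE_DATA is assumed. */
enum guest_access_type { TYPE_DATA, TYPE_INSN };

struct toy_vcpu;

/* Hypothetical per-host-MMU operations table. */
struct host_mmu_ops {
	int (*read_guest)(struct toy_vcpu *vcpu, uint64_t eaddr,
			  enum guest_access_type type,
			  void *buf, size_t len);
};

/* Toy stand-in for a vcpu: a backend plus a flat block of memory. */
struct toy_vcpu {
	const struct host_mmu_ops *mmu;
	const uint8_t *guest_mem;
	size_t guest_mem_size;
};

/*
 * Toy backend: treats eaddr as an offset into guest_mem. A real
 * backend would translate the address (tlbsx, shadow TLB, ...) and
 * check execute vs. read permission based on 'type'.
 */
static int flat_read_guest(struct toy_vcpu *vcpu, uint64_t eaddr,
			   enum guest_access_type type,
			   void *buf, size_t len)
{
	(void)type;

	if (eaddr > vcpu->guest_mem_size ||
	    len > vcpu->guest_mem_size - eaddr)
		return -1;	/* caller must handle the miss */

	memcpy(buf, vcpu->guest_mem + eaddr, len);
	return 0;
}

static const struct host_mmu_ops flat_mmu_ops = {
	.read_guest = flat_read_guest,
};

/* Generic entry point: no host MMU knowledge lives here. */
static int sketch_read_guest(struct toy_vcpu *vcpu, uint64_t eaddr,
			     enum guest_access_type type,
			     void *buf, size_t len)
{
	assert(vcpu && vcpu->mmu && vcpu->mmu->read_guest);
	return vcpu->mmu->read_guest(vcpu, eaddr, type, buf, len);
}
```

The emulation path would then only call the generic function with
TYPE_INSN and treat a failure as "go back and take the miss".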


Alex