Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

From: Alexander Graf <hidden>
Date: 2013-07-10 22:50:07
Also in: kvm

On 10.07.2013, at 20:42, Scott Wood wrote:

On 07/10/2013 05:15:09 AM, Alexander Graf wrote:

quoted

On 10.07.2013, at 02:06, Scott Wood wrote:

quoted

On 07/09/2013 04:44:24 PM, Alexander Graf wrote:

quoted

On 09.07.2013, at 20:46, Scott Wood wrote:

quoted

I suspect that tlbsx is faster, or at worst similar.  And unlike
comparing tlbsx to lwepx (not counting a fix for the threading problem),
we don't already have code to search the guest TLB, so testing would be
more work.

quoted

We have code to walk the guest TLB for TLB misses. This really is
just the TLB miss search without host TLB injection.

quoted

So let's say we're using the shadow TLB. The guest always has its
say 64 TLB entries that it can count on - we never evict anything by
accident, because we store all of the 64 entries in our guest TLB cache.
When the guest faults at an address, the first thing we do is we check
the cache whether we have that page already mapped.
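
To make the "check the cache first" step concrete, here is a minimal
userspace sketch of such a lookup. Everything in it (the toy_ names, the
struct layout, the linear search) is an assumption made for illustration
and is not the kernel's actual guest TLB code; a real lookup would also
have to match PID and address space, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size, taken from the "say 64 TLB entries" above. */
#define TOY_GUEST_TLB_ENTRIES 64

/* Hypothetical cached guest TLB entry (no PID/AS matching here). */
struct toy_tlb_entry {
    bool valid;
    uint64_t eaddr;  /* first effective address covered */
    uint64_t size;   /* length of the mapping in bytes */
    uint64_t raddr;  /* real address the range maps to */
};

/* The cache holds every guest entry, so nothing is evicted by accident. */
struct toy_guest_tlb {
    struct toy_tlb_entry e[TOY_GUEST_TLB_ENTRIES];
};

/*
 * Search the guest TLB cache for an entry covering ea.
 * Returns the entry index, or -1 if the guest has no such mapping.
 */
static int toy_guest_tlb_search(const struct toy_guest_tlb *tlb, uint64_t ea)
{
    for (int i = 0; i < TOY_GUEST_TLB_ENTRIES; i++) {
        const struct toy_tlb_entry *t = &tlb->e[i];

        if (t->valid && ea >= t->eaddr && ea - t->eaddr < t->size)
            return i;
    }
    return -1;
}
```

A hit here only says that the guest has the mapping; it says nothing
about whether the host TLB still holds the corresponding entry.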

quoted

However, with this method we now have 2 enumeration methods for
guest TLB searches. We have the tlbsx one which searches the host TLB
and we have our guest TLB cache. The guest TLB cache might still contain
an entry for an address that we already invalidated on the host. Would
that pose a problem?

quoted

I guess not because we're swizzling the exit code around to
instead be an instruction miss which means we restore the TLB entry into
our host's TLB so that when we resume, we land here and the tlbsx hits.
But it feels backwards.

quoted

Any better way?  Searching the guest TLB won't work for the LRAT
case, so we'd need to have this logic around anyway.  We shouldn't add a
second codepath unless it's a clear performance gain -- and again, I
suspect it would be the opposite, especially if the entry is not in TLB0
or in one of the first few entries searched in TLB1.  The tlbsx miss
case is not what we should optimize for.

quoted

Hrm.
So let's redesign this thing theoretically. We would have an exit
that requires an instruction fetch. We would override
kvmppc_get_last_inst() to always do kvmppc_ld_inst(). That one can fail
because it can't find the TLB entry in the host TLB. When it fails, we
have to abort the emulation and resume the guest at the same IP.

quoted

Now the guest gets the TLB miss, we populate, go back into the guest.
The guest hits the emulation failure again. We go back to
kvmppc_ld_inst() which succeeds this time and we can emulate the
instruction.
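
The flow described above can be sketched as a small userspace model. All
names below (the toy_ functions, the result codes, the single-flag "host
TLB") are hypothetical stand-ins chosen for illustration; this is not the
real kvmppc_get_last_inst() or kvmppc_ld_inst() interface, only the
control flow of a failable fetch followed by a retry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes, not the kernel's constants. */
enum toy_fetch_result { TOY_FETCH_OK, TOY_FETCH_AGAIN };

/* Toy vcpu: one flag stands in for "host TLB maps the guest PC". */
struct toy_vcpu {
    bool host_tlb_has_pc;
    uint32_t inst_at_pc;          /* instruction word at the guest PC */
    unsigned int guest_reentries; /* times we resumed at the same PC */
};

/* Failable fetch: fails if the host TLB entry has been evicted. */
static enum toy_fetch_result toy_get_last_inst(const struct toy_vcpu *vcpu,
                                               uint32_t *inst)
{
    if (!vcpu->host_tlb_has_pc)
        return TOY_FETCH_AGAIN;
    *inst = vcpu->inst_at_pc;
    return TOY_FETCH_OK;
}

/*
 * Stand-in for resuming the guest at the same PC: the guest takes the
 * TLB miss and the normal miss path repopulates the host TLB.
 */
static void toy_resume_guest(struct toy_vcpu *vcpu)
{
    vcpu->guest_reentries++;
    vcpu->host_tlb_has_pc = true;
}

/*
 * Emulation exit: returns true once the instruction was fetched and
 * "emulated"; returns false if emulation was aborted and the guest
 * was resumed so that the same exit is taken again.
 */
static bool toy_handle_emulation_exit(struct toy_vcpu *vcpu, uint32_t *emulated)
{
    uint32_t inst;

    if (toy_get_last_inst(vcpu, &inst) != TOY_FETCH_OK) {
        toy_resume_guest(vcpu);
        return false;
    }
    *emulated = inst;
    return true;
}
```

In this model the first call with an evicted entry returns false after
one guest re-entry, and the second call fetches the instruction. The
exit handler never rewrites the exit type; it only reports that the
fetch has to be retried.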

That's pretty much what this patch does, except that it goes
immediately to the TLB miss code rather than having the extra round-trip
back to the guest.  Is there any benefit from adding that extra
round-trip?  Rewriting the exit type instead doesn't seem that bad...

It's pretty bad. I want to have code that is easy to follow - and I
don't care whether the very rare case of a TLB entry getting evicted by
a random other thread right when we execute the exit path is slower by a
few percent if we get cleaner code for that.


quoted

I think this works. Just make sure that the gateway to the
instruction fetch is kvmppc_get_last_inst() and make that failable. Then
the difference between looking for the TLB entry in the host's TLB or in
the guest's TLB cache is hopefully negligible.

I don't follow here.  What does this have to do with looking in the
guest TLB?

I want to hide the fact that we're cheating as much as possible, that's
it.


Alex
