Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation
From: Alexander Graf <hidden>
Date: 2013-07-09 21:44:31
Also in:
kvm
On 09.07.2013, at 20:46, Scott Wood wrote:
> On 07/09/2013 12:44:32 PM, Alexander Graf wrote:
>> On 07/09/2013 07:13 PM, Scott Wood wrote:
>>> On 07/08/2013 08:39:05 AM, Alexander Graf wrote:
>>>> On 28.06.2013, at 11:20, Mihai Caraman wrote:
>>>>> lwepx faults need to be handled by KVM and this implies additional code
>>>>> in DO_KVM macro to identify the source of the exception originated from
>>>>> host context. This requires to check the Exception Syndrome Register
>>>>> (ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) for DTB_MISS,
>>>>> DSI and LRAT exceptions which is too intrusive for the host.
>>>>>
>>>>> Get rid of lwepx and acquire last instruction in kvmppc_handle_exit() by
>>>>> searching for the physical address and kmap it. This fixes an infinite loop
>>>>
>>>> What's the difference in speed for this? Also, could we call lwepx later
>>>> in host code, when kvmppc_get_last_inst() gets invoked?
>>>
>>> Any use of lwepx is problematic unless we want to add overhead to the
>>> main Linux TLB miss handler.
>>
>> What exactly would be missing?
>
> If lwepx faults, it goes to the normal host TLB miss handler. Without
> adding code to it to recognize that it's an external-PID fault, it will
> try to search the normal Linux page tables and insert a normal host
> entry. If it thinks it has succeeded, it will retry the instruction
> rather than search for an exception handler. The instruction will fault
> again, and you get a hang. :(

So we either have to rewrite IVOR / IVPR or add a branch in the hot TLB
miss interrupt handler. Both alternatives suck.
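
To make the hang described above concrete, here is a toy user-space model in C. It is only a sketch under loud assumptions: the names (host_tlb_miss_handler, run_access, the enum) are invented for illustration and are not kernel code, and the "TLB" is just two flags. The handler behaves as Scott describes: it does not know about external-PID accesses, installs a normal host entry, and reports success, so the faulting access is retried.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for who issued the faulting access. */
enum miss_origin { HOST_ACCESS, EXTERNAL_PID_ACCESS };

/*
 * Toy stand-in for the host TLB miss handler: it only knows how to
 * install a host-context entry and always believes it succeeded.
 */
static bool host_tlb_miss_handler(bool *host_entry_present)
{
	*host_entry_present = true;
	return true;
}

/*
 * Returns the number of faults taken before the access hit, or -1 if
 * the retry budget ran out (the model's version of "you get a hang").
 */
static int run_access(enum miss_origin origin, int max_retries)
{
	bool host_entry_present = false;
	/* Never set: the handler has no code path for the guest context. */
	const bool guest_entry_present = false;
	int faults = 0;

	assert(max_retries > 0);

	while (faults < max_retries) {
		bool hit = (origin == HOST_ACCESS) ? host_entry_present
						   : guest_entry_present;
		if (hit)
			return faults;

		faults++;
		/* Handler reports success, so the access is simply retried. */
		host_tlb_miss_handler(&host_entry_present);
	}
	return -1;
}
```

In this model a host access hits after one fault, while an external-PID access never hits no matter how large the retry budget is. Recognizing the external-PID case would need an extra check in the handler, which is the added branch in the hot path discussed above.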

>> I'd also still like to see some performance benchmarks on this to make
>> sure we're not walking into a bad direction.
>
> I doubt it'll be significantly different. There's overhead involved in
> setting up for lwepx as well. It doesn't hurt to test, though this is a
> functional correctness issue, so I'm not sure what better alternatives we
> have. I don't want to slow down non-KVM TLB misses for this.

Yeah, I concur on that part. It probably won't get better. Sigh.

>>>>> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
>>>>> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
>>>>> +
>>>>> +	/* Map a page and get guest's instruction */
>>>>> +	page = pfn_to_page(addr >> PAGE_SHIFT);
>>>>
>>>> So it seems to me like you're jumping through a lot of hoops to make sure
>>>> this works for LRAT and non-LRAT at the same time. Can't we just treat
>>>> them as the different things they are?
>>>>
>>>> What if we have different MMU backends for LRAT and non-LRAT? The
>>>> non-LRAT case could then try lwepx, if that fails, fall back to read the
>>>> shadow TLB. For the LRAT case, we'd do lwepx, if that fails fall back to
>>>> this logic.
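
For readers following the address arithmetic in the quoted hunk, a standalone C sketch of the same expression may help. The variable names mas7_mas3, geaddr and psize_shift come from the patch; the wrapper function compose_phys_addr and the example values are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the page frame bits of a TLB entry with the in-page offset of
 * the guest effective address. Mirrors the expression in the quoted
 * hunk; the function name is hypothetical.
 */
static uint64_t compose_phys_addr(uint64_t mas7_mas3, uint64_t geaddr,
				  unsigned int psize_shift)
{
	uint64_t page_mask, offset_mask;

	assert(psize_shift < 64);	/* a shift by >= 64 is undefined in C */

	page_mask = ~0ULL << psize_shift;		/* upper bits: page frame */
	offset_mask = (1ULL << psize_shift) - 1ULL;	/* lower bits: offset */

	/* Low-order non-address bits of mas7_mas3 are masked away here. */
	return (mas7_mas3 & page_mask) | (geaddr & offset_mask);
}
```

As a worked case with a 4 KiB page (psize_shift = 12): mas7_mas3 = 0x12345603f and geaddr = 0xc0001abc give 0x123456000 | 0xabc = 0x123456abc. Shifting that result right by PAGE_SHIFT is what feeds pfn_to_page() in the hunk.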
>>>
>>> This isn't about LRAT; it's about hardware threads. It also fixes the
>>> handling of execute-only pages on current chips.
>>
>> On non-LRAT systems we could always check our shadow copy of the guest's
>> TLB, no? I'd really like to know what the performance difference would be
>> for the 2 approaches.
>
> I suspect that tlbsx is faster, or at worst similar. And unlike comparing
> tlbsx to lwepx (not counting a fix for the threading problem), we don't
> already have code to search the guest TLB, so testing would be more work.

We have code to walk the guest TLB for TLB misses. This really is just
the TLB miss search without host TLB injection.

So let's say we're using the shadow TLB. The guest always has its, say,
64 TLB entries that it can count on - we never evict anything by
accident, because we store all of the 64 entries in our guest TLB cache.
When the guest faults at an address, the first thing we do is check the
cache for whether we have that page already mapped.

However, with this method we now have 2 enumeration methods for guest
TLB searches. We have the tlbsx one which searches the host TLB and we
have our guest TLB cache. The guest TLB cache might still contain an
entry for an address that we already invalidated on the host. Would that
pose a problem?

I guess not because we're swizzling the exit code around to instead be
an instruction miss, which means we restore the TLB entry into our host's
TLB so that when we resume, we land here and the tlbsx hits. But it
feels backwards.

At least this code has to become something more generic, such as
kvmppc_read_guest(vcpu, addr, TYPE_INSN) and move into the host mmu
implementation, as it's 100% host mmu specific.

Alex
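
A rough sketch of "the TLB miss search without host TLB injection" over a guest TLB cache, in plain C. Everything here is an assumption for illustration: the struct layout, the names guest_tlb_search and guest_tlb_translate, and the 64-entry size (taken from the "say 64" above). It ignores PID and address-space matching, which a real search would need, and it is not the existing KVM code or the proposed kvmppc_read_guest().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_TLB_ENTRIES 64	/* illustrative size only */

/* Hypothetical cached copy of one guest TLB entry. */
struct guest_tlb_entry {
	bool valid;
	uint64_t eaddr_base;		/* page-aligned guest effective address */
	uint64_t paddr_base;		/* page-aligned translated address */
	unsigned int psize_shift;	/* log2 of the page size */
};

/* Search only: nothing is written to any host TLB. */
static const struct guest_tlb_entry *
guest_tlb_search(const struct guest_tlb_entry *tlb, size_t n, uint64_t eaddr)
{
	for (size_t i = 0; i < n; i++) {
		const struct guest_tlb_entry *e = &tlb[i];
		uint64_t page_mask;

		if (!e->valid)
			continue;
		assert(e->psize_shift < 64);
		page_mask = ~0ULL << e->psize_shift;
		if ((eaddr & page_mask) == e->eaddr_base)
			return e;
	}
	return NULL;
}

/* Translate eaddr through the cache; returns false on a miss. */
static bool guest_tlb_translate(const struct guest_tlb_entry *tlb, size_t n,
				uint64_t eaddr, uint64_t *paddr)
{
	const struct guest_tlb_entry *e = guest_tlb_search(tlb, n, eaddr);

	if (!e)
		return false;
	/* Keep the in-page offset, replace the page frame bits. */
	*paddr = e->paddr_base | (eaddr & ~(~0ULL << e->psize_shift));
	return true;
}
```

A miss in this cache and a miss from tlbsx on the host TLB need not agree, since the cache can hold an entry the host has already invalidated. That is the double-enumeration concern raised above.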