Thread: 15 messages, 3 authors, 2021-08-05

Re: About two-dimensional page translation (e.g., Intel EPT) and shadow page table in Linux QEMU/KVM

From: harry harry <hidden>
Date: 2021-08-05 19:43:44
Also in: qemu-devel

Sean, understood with many thanks!

Good luck,
Harry

On Wed, Jul 28, 2021 at 3:01 PM Sean Christopherson [off-list ref] wrote:
On Wed, Jul 28, 2021, harry harry wrote:
Sean, sorry for the late reply. Thanks for your careful explanations.
For emulation of any instruction/flow that starts with a guest virtual address.
On Intel CPUs, that includes quite literally any "full" instruction emulation,
since KVM needs to translate CS:RIP to a guest physical address in order to fetch
the guest's code stream.  KVM can't avoid "full" emulation unless the guest is
heavily enlightened, e.g. to avoid string I/O, among many other things.
Do you mean the emulated MMU is needed when it *only* wants to
translate GVAs to GPAs in the guest level?
Not quite, though gva_to_gpa() is the main use.  The emulated MMU is also used to
inject guest #PF and to load/store guest PDPTRs.
In such cases, the hardware MMU cannot be used because hardware MMU
can only translate GVAs to HPAs, right?
Sort of.  The hardware MMU does translate GVA to GPA, but the GPA value is not
visible to software (unless the GPA->HPA translation faults).  That's also true
for VA to PA (and GVA to HPA).  Irrespective of virtualization, x86 ISA doesn't
provide an instruction to retrieve the PA for a given VA.

If such an instruction did exist, and it was to be usable for a VMM to do a
GVA->GPA translation, the magic instruction would need to take all MMU params as
operands, e.g. CR0, CR3, CR4, and EFER.  When KVM is active (not the guest), the
hardware MMU is loaded with the host MMU configuration, not the guest.  In both
VMX and SVM, vCPU state is mostly ephemeral in the sense that it ceases to exist
in hardware when the vCPU exits to the host.  Some state is retained in hardware,
e.g. TLB and cache entries, but those are associated with select properties of
the vCPU, e.g. EPTP, CR3, etc..., not with the vCPU itself, i.e. not with the
VMCS (VMX) / VMCB (SVM).
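
To make the point above concrete, here is a minimal sketch of a software
GVA->GPA walk that takes the guest MMU state as explicit arguments, because
that state is not loaded in hardware while the host runs. It is not KVM's
implementation. The names MmuParams, gva_to_gpa, GuestPageFault and the
read_gpa callback are hypothetical, and the sketch assumes 4-level paging with
4 KiB pages only (no large pages, no permission, reserved-bit or
accessed/dirty handling).

```python
from dataclasses import dataclass
from typing import Callable

# Architectural control bits used to pick the paging mode.
CR0_PG = 1 << 31
CR4_PAE = 1 << 5
CR4_LA57 = 1 << 12
EFER_LME = 1 << 8

# Page-table entry bits used by this sketch.
PTE_PRESENT = 1 << 0
PTE_PS = 1 << 7
ADDR_MASK = 0x000F_FFFF_FFFF_F000  # physical address bits 51:12


class GuestPageFault(Exception):
    """Hypothetical stand-in for a #PF that would be injected into the guest."""


@dataclass
class MmuParams:
    """Guest MMU state passed in explicitly (hypothetical container)."""
    cr0: int
    cr3: int
    cr4: int
    efer: int


def gva_to_gpa(read_gpa: Callable[[int], int], mmu: MmuParams, gva: int) -> int:
    """Walk the guest page tables in software and return the GPA for gva.

    read_gpa(gpa) must return the 64-bit value stored at that guest
    physical address (assumption: the caller can read guest memory).
    """
    if not mmu.cr0 & CR0_PG:
        return gva  # paging disabled: the linear address is the physical one

    four_level = (
        mmu.cr4 & CR4_PAE and mmu.efer & EFER_LME and not mmu.cr4 & CR4_LA57
    )
    if not four_level:
        raise NotImplementedError("sketch only covers 4-level paging")

    table = mmu.cr3 & ADDR_MASK
    for shift in (39, 30, 21, 12):  # PML4, PDPT, PD, PT
        index = (gva >> shift) & 0x1FF
        entry = read_gpa(table + index * 8)
        if not entry & PTE_PRESENT:
            raise GuestPageFault(f"not present at shift {shift}, gva={gva:#x}")
        if shift != 12 and entry & PTE_PS:
            raise NotImplementedError("large pages are not handled here")
        table = entry & ADDR_MASK
    return table | (gva & 0xFFF)


if __name__ == "__main__":
    # Toy guest memory: one mapping, tables at GPAs 0x1000..0x4000.
    mem = {0x1008: 0x2001, 0x2010: 0x3001, 0x3018: 0x4001, 0x4020: 0xABCDE001}
    params = MmuParams(cr0=CR0_PG, cr3=0x1000, cr4=CR4_PAE, efer=EFER_LME)
    gva = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x567
    print(hex(gva_to_gpa(lambda gpa: mem.get(gpa, 0), params, gva)))  # 0xabcde567
```

The walker never consults the CPU's current CR3 or EFER. Everything comes from
the saved guest values, which is why the hypothetical instruction discussed
above would need those registers as operands.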