Thread: 21 messages, 5 authors, 2021-06-08

Re: [PATCH] KVM: X86: fix tlb_flush_guest()

From: Sean Christopherson <seanjc@google.com>
Date: 2021-06-02 22:08:46
Also in: lkml

On Wed, Jun 02, 2021, Sean Christopherson wrote:
On Fri, May 28, 2021, Lai Jiangshan wrote:

On 2021/5/28 03:28, Sean Christopherson wrote:
On Thu, May 27, 2021, Sean Christopherson wrote:
KVM_REQ_MMU_RELOAD is overkill, nuking the shadow page tables will completely
offset the performance gains of the paravirtualized flush.
Argh, I take that back.  The PV KVM_VCPU_FLUSH_TLB flag doesn't distinguish
between flushing a specific mm and flushing the entire TLB.  The HyperV usage
(via KVM_REQ) also throws everything into a single bucket.  A full RELOAD still
isn't necessary as KVM just needs to sync all roots, not blast them away.  For
previous roots, KVM doesn't have a mechanism to defer the sync, so the immediate
fix will need to unload those roots.

And looking at KVM's other flows, __kvm_mmu_new_pgd() and kvm_set_cr3() are also
broken with respect to previous roots.  E.g. if the guest does a MOV CR3 that
flushes the entire TLB, followed by a MOV CR3 with PCID_NOFLUSH=1, KVM will fail
to sync the MMU on the second flush even though the guest can technically rely
on the first MOV CR3 to have synchronized any previous changes relative to the
first MOV CR3.
Could you elaborate on the problem, please?
When does a MOV CR3 need to flush the entire TLB if PCID is enabled?
Scratch that, I was wrong.  The SDM explicitly states that other PCIDs don't
need to be flushed if CR4.PCIDE=1.
*sigh*

I was partially right.  If the guest does

  1: MOV    B, %rax
     MOV %rax, %cr3

  2: <modify PTEs in B>

  3: MOV    A, %rax
     MOV %rax, %cr3
 
  4: MOV    B, %rax
     BTS  $63, %rax
     MOV %rax, %cr3

where A and B are CR3 values with the same PCID, then KVM will fail to sync B at
step (4) due to PCID_NOFLUSH, even though the guest can technically rely on
its modifications at step (2) to become visible at step (3) when the PCID is
flushed on CR3 load.

So it's not a full TLB flush, but rather a flush of the PCID, which can
theoretically impact previous CR3 values that use the same PCID.
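
To make the sequence above concrete, here is a rough toy model in Python.  It is
only a sketch, not KVM code: ToyShadowMMU, load_cr3(), run_sequence(), the "fix"
flag and all addresses are hypothetical names and values made up for
illustration.  The model assumes a shadow root is a snapshot of the guest page
tables that is refreshed only on sync, that a CR3 load without PCID_NOFLUSH
syncs the new root, and that a CR3 load with PCID_NOFLUSH reuses a cached
previous root as-is.  The "fix" variant drops cached roots that share the PCID
being flushed, which is one possible approach, not necessarily what the eventual
patch does.

```python
PCID_MASK = 0xFFF        # CR3[11:0] holds the PCID when CR4.PCIDE=1
PCID_NOFLUSH = 1 << 63   # bit 63 of the MOV CR3 source operand


class ToyShadowMMU:
    """Hypothetical model: one shadow snapshot per guest CR3 value."""

    def __init__(self, guest_tables, fix=False):
        self.guest_tables = guest_tables  # CR3 value -> {va: pa}, owned by the guest
        self.shadow = {}                  # cached roots: CR3 value -> snapshot
        self.current = None
        self.fix = fix

    def _sync(self, root):
        # Re-read the guest page tables for this root.
        self.shadow[root] = dict(self.guest_tables[root])

    def load_cr3(self, value):
        noflush = bool(value & PCID_NOFLUSH)
        root = value & ~PCID_NOFLUSH
        if not noflush:
            if self.fix:
                # The guest flushed this PCID, so every cached root that
                # shares the PCID is potentially stale.  Drop those roots.
                pcid = root & PCID_MASK
                stale = [r for r in self.shadow
                         if r != root and (r & PCID_MASK) == pcid]
                for r in stale:
                    del self.shadow[r]
            self._sync(root)
        elif root not in self.shadow:
            # No cached root to reuse, build a fresh one.
            self._sync(root)
        # else: PCID_NOFLUSH with a cached root, reused without a sync.
        self.current = root

    def translate(self, va):
        return self.shadow[self.current].get(va)


def run_sequence(fix):
    A = 0x1000 | 0x001   # made-up CR3 values, both with PCID 1
    B = 0x2000 | 0x001
    va = 0x400000
    tables = {A: {va: 0xA000}, B: {va: 0xB000}}
    mmu = ToyShadowMMU(tables, fix=fix)

    mmu.load_cr3(B)                  # step 1
    tables[B][va] = 0xB111           # step 2: guest modifies PTEs in B
    mmu.load_cr3(A)                  # step 3: flushes PCID 1
    mmu.load_cr3(B | PCID_NOFLUSH)   # step 4
    return mmu.translate(va)
```

In this model run_sequence(fix=False) returns the old translation 0xB000 at step
(4), i.e. the stale mapping the guest is entitled to no longer see, while
run_sequence(fix=True) returns 0xB111.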