Re: [PATCH v9 4/8] KVM: Use gfn instead of hva for mmu_notifier_retry
From: Chao Peng <hidden>
Date: 2022-11-08 07:21:06
Also in:
kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Fri, Nov 04, 2022 at 10:29:48PM +0000, Sean Christopherson wrote:
On Fri, Nov 04, 2022, Chao Peng wrote:quoted
On Thu, Oct 27, 2022 at 11:29:14AM +0100, Fuad Tabba wrote:quoted
Hi, On Tue, Oct 25, 2022 at 4:19 PM Chao Peng [off-list ref] wrote:quoted
Currently in mmu_notifier validate path, hva range is recorded and then checked against in the mmu_notifier_retry_hva() of the page fault path. However, for the to be introduced private memory, a page fault may not have a hva associated, checking gfn(gpa) makes more sense. For existing non private memory case, gfn is expected to continue to work. The only downside is when aliasing multiple gfns to a single hva, the current algorithm of checking multiple ranges could result in a much larger range being rejected. Such aliasing should be uncommon, so the impact is expected small. It also fixes a bug in kvm_zap_gfn_range() which has already been usingnit: Now it's kvm_unmap_gfn_range().Forgot to mention: the bug is still with kvm_zap_gfn_range(). It calls kvm_mmu_invalidate_begin/end with a gfn range but before this series kvm_mmu_invalidate_begin/end actually accept a hva range. Note it's unrelated to whether we use kvm_zap_gfn_range() or kvm_unmap_gfn_range() in the following patch (patch 05).Grr, in the future, if you find an existing bug, please send a patch. At the very least, report the bug.
Agreed, this can be sent out separately from this series.
The APICv case that this was added for could very well be broken because of this, and the resulting failures would be an absolute nightmare to debug.
Given the apicv_inhibit should be rare, the change looks good to me. Just to be clear, your will send out this fix, right? Chao
quoted hunk ↗ jump to hunk
Compile tested only... -- From: Sean Christopherson <seanjc@google.com> Date: Fri, 4 Nov 2022 22:20:33 +0000 Subject: [PATCH] KVM: x86/mmu: Block all page faults during kvm_zap_gfn_range() When zapping a GFN range, pass 0 => ALL_ONES for the to-be-invalidated range to effectively block all page faults while the zap is in-progress. The invalidation helpers take a host virtual address, whereas zapping a GFN obviously provides a guest physical address and with the wrong unit of measurement (frame vs. byte). Alternatively, KVM could walk all memslots to get the associated HVAs, but thanks to SMM, that would require multiple lookups. And practically speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path, e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0 during boot, and APICv inhibits are similarly infrequent operations. Fixes: edb298c663fc ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range") Cc: stable@vger.kernel.org Cc: Maxim Levitsky <redacted> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/mmu/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6f81539061d6..1ccb769f62af 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c@@ -6056,7 +6056,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) write_lock(&kvm->mmu_lock); - kvm_mmu_invalidate_begin(kvm, gfn_start, gfn_end); + kvm_mmu_invalidate_begin(kvm, 0, -1ul); flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);@@ -6070,7 +6070,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) kvm_flush_remote_tlbs_with_address(kvm, gfn_start, gfn_end - gfn_start); - kvm_mmu_invalidate_end(kvm, gfn_start, gfn_end); + kvm_mmu_invalidate_end(kvm, 0, -1ul); write_unlock(&kvm->mmu_lock); }base-commit: c12879206e47730ff5ab255bbf625b28ade4028f --