Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd

[PATCH v8 0/8] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2022-09-15
[PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · David Hildenbrand <hidden> · 2022-09-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-09-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · "Andy Lutomirski" <luto@kernel.org> · 2022-09-21
RE: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Wang, Wei W <hidden> · 2022-09-22
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-23
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-23
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-09-26
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-26
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-09-27
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-30
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-10-13
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-10-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-10-18
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-09-23
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · David Hildenbrand <hidden> · 2022-09-26
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A. Shutemov <hidden> · 2022-09-26
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · David Hildenbrand <hidden> · 2022-09-26
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-09-27
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A. Shutemov <hidden> · 2022-09-28
RE: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Wang, Wei W <hidden> · 2022-09-22
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-09-22
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-09-23
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-23
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-09-30
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-09-30
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-03
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A. Shutemov <hidden> · 2022-10-03
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-04
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Fuad Tabba <hidden> · 2022-10-06
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A. Shutemov <hidden> · 2022-10-06
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Vlastimil Babka <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Gupta, Pankaj <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-10-17
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Vishal Annapurve <hidden> · 2022-10-18
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-10-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Vishal Annapurve <hidden> · 2022-10-20
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-10-21
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-10-21
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Vishal Annapurve <hidden> · 2022-10-19
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Chao Peng <hidden> · 2022-10-21
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Sean Christopherson <seanjc@google.com> · 2022-10-21
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Kirill A . Shutemov <hidden> · 2022-10-24
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · David Hildenbrand <hidden> · 2022-10-24
Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd · Vishal Annapurve <hidden> · 2022-11-03
[PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Bagas Sanjaya <hidden> · 2022-09-16
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-09-16
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Fuad Tabba <hidden> · 2022-09-26
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-09-26
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Isaku Yamahata <hidden> · 2022-09-29
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Sean Christopherson <seanjc@google.com> · 2022-09-29
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-05
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-05
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Fuad Tabba <hidden> · 2022-10-06
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-06
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-06
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Sean Christopherson <seanjc@google.com> · 2022-10-06
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-07
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Sean Christopherson <seanjc@google.com> · 2022-10-07
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-07
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-08
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-08
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-10-10
Re: [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-12
[PATCH v8 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Bagas Sanjaya <hidden> · 2022-09-16
Re: [PATCH v8 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Chao Peng <hidden> · 2022-09-16
[PATCH v8 4/8] KVM: Use gfn instead of hva for mmu_notifier_retry · Chao Peng <hidden> · 2022-09-15
[PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Fuad Tabba <hidden> · 2022-09-26
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Chao Peng <hidden> · 2022-09-26
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Fuad Tabba <hidden> · 2022-10-11
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Chao Peng <hidden> · 2022-10-12
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Fuad Tabba <hidden> · 2022-10-17
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Sean Christopherson <seanjc@google.com> · 2022-10-17
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Chao Peng <hidden> · 2022-10-19
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Fuad Tabba <hidden> · 2022-10-19
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Sean Christopherson <seanjc@google.com> · 2022-10-19
Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions · Fuad Tabba <hidden> · 2022-10-19
[PATCH v8 6/8] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 6/8] KVM: Update lpage info when private/shared memory are mixed · Isaku Yamahata <hidden> · 2022-09-29
Re: [PATCH v8 6/8] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2022-09-30
[PATCH v8 7/8] KVM: Handle page fault for private memory · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 7/8] KVM: Handle page fault for private memory · Sean Christopherson <seanjc@google.com> · 2022-10-14
Re: [PATCH v8 7/8] KVM: Handle page fault for private memory · Chao Peng <hidden> · 2022-10-17
[PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2022-09-15
Re: [PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE · Jarkko Sakkinen <jarkko@kernel.org> · 2022-10-04
Re: [PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2022-10-10
Re: [PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE · Fuad Tabba <hidden> · 2022-10-06
Re: [PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2022-10-10

From: Sean Christopherson <seanjc@google.com>
Date: 2022-10-21 16:53:56
Also in: kvm, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel

On Fri, Oct 21, 2022, Chao Peng wrote:

On Thu, Oct 20, 2022 at 04:20:58PM +0530, Vishal Annapurve wrote:

On Wed, Oct 19, 2022 at 9:02 PM Kirill A . Shutemov [off-list ref] wrote:

On Tue, Oct 18, 2022 at 07:12:10PM +0530, Vishal Annapurve wrote:

I think moving this notifier_invalidate before fallocate may not solve
the problem completely. Is it possible that between invalidate and
fallocate, KVM tries to handle the page fault for the guest VM from
another vcpu and uses the pages to be freed to back gpa ranges? Should
hole punching here also update mem_attr first to say that KVM should
consider the corresponding gpa ranges to be no longer backed by
inaccessible memfd?

We rely on external synchronization to prevent this. See code around
mmu_invalidate_retry_hva().

--
  Kiryl Shutsemau / Kirill A. Shutemov

IIUC, mmu_invalidate_retry_hva/gfn ensures that page faults on gfn
ranges that are being invalidated are retried till invalidation is
complete. In this case, is it possible that KVM tries to serve the
page fault after inaccessible_notifier_invalidate is complete but
before fallocate could punch a hole in the file?

It's not just the page fault edge case.  In the more straightforward scenario
where the memory is already mapped into the guest, freeing pages back to the kernel
before they are removed from the guest will lead to use-after-free.

e.g.
inaccessible_notifier_invalidate(...)
... (system event preempting this control flow, giving a window for
the guest to retry accessing the gfn range which was invalidated)
fallocate(.., PUNCH_HOLE..)

It looks like this can happen.
It sounds to me like the solution just needs to
follow the mmu_notifier's approach of using an invalidate_start/end pair.

  invalidate_start()  --> kvm->mmu_invalidate_in_progress++;
                          zap KVM page table entries;
  fallocate()
  invalidate_end()  --> kvm->mmu_invalidate_in_progress--;

Then, during the invalidate_start/end window, mmu_invalidate_retry_gfn
checks 'mmu_invalidate_in_progress' and prevents repopulating the same
page in the KVM page table.

Yes, if it's not safe to invalidate after making the change (fallocate()), then
the change needs to be bookended by a start+end pair.  The mmu_notifier's unpaired
invalidate() hook works by zapping the primary MMU's PTEs before invalidate(), but
frees the underlying physical page _after_ invalidate().

And the only reason the unpaired invalidate() exists is because there are secondary
MMUs that reuse the primary MMU's page tables, e.g. shared virtual addressing, in
which case bookending doesn't work because the secondary MMU can't remove PTEs, it
can only flush its TLBs.

For this case, the whole point is to not create PTEs in the primary MMU, so there
should never be a use case that _needs_ an unpaired invalidate().

TL;DR: a start+end pair is likely the simplest solution.
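
To make the bookending concrete, here is a minimal single-threaded model of the
start+end scheme discussed above. It is only a sketch: struct vm_state and all
of the function names are hypothetical stand-ins rather than the KVM code, there
is no locking or memory ordering, and the hole punch itself is left to the
caller. It assumes the usual pairing of an in-progress counter with a sequence
count that is bumped when the window closes.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant per-VM state. */
struct vm_state {
	unsigned long invalidate_in_progress; /* nonzero inside a start/end window */
	unsigned long invalidate_seq;         /* bumped each time a window closes */
	bool mapped;                          /* page present in the guest page tables? */
};

/* Open the window and zap the guest mapping before the backing page is freed. */
static inline void invalidate_start(struct vm_state *vm)
{
	vm->invalidate_in_progress++;
	vm->mapped = false;
}

/* Close the window; faults that snapshotted the old sequence must retry. */
static inline void invalidate_end(struct vm_state *vm)
{
	vm->invalidate_seq++;
	vm->invalidate_in_progress--;
}

/* A fault handler snapshots the sequence before looking up the page. */
static inline unsigned long fault_begin(const struct vm_state *vm)
{
	return vm->invalidate_seq;
}

/* True if the fault raced with an invalidation and must be retried. */
static inline bool invalidate_retry(const struct vm_state *vm, unsigned long seq)
{
	if (vm->invalidate_in_progress)
		return true;
	return vm->invalidate_seq != seq;
}

/* Install the mapping only if no invalidation is in flight or has completed. */
static inline bool fault_try_install(struct vm_state *vm, unsigned long seq)
{
	if (invalidate_retry(vm, seq))
		return false;
	vm->mapped = true;
	return true;
}
```

In this model the caller would run invalidate_start(), then the hole punch, then
invalidate_end(). A fault that arrives inside the window is refused because the
counter is nonzero, and a fault that took its snapshot before the window is
refused afterwards because the sequence changed, so the page being freed cannot
be mapped back into the guest.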
