Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

From: David Matlack <dmatlack@google.com>
Date: 2023-11-02 16:28:56
Also in: kvm, kvm-riscv, kvmarm, linux-arm-kernel, linux-fsdevel, linux-mips, linux-mm, linux-riscv, lkml

On Thu, Nov 2, 2023 at 9:03 AM Sean Christopherson [off-list ref] wrote:
> On Thu, Nov 02, 2023, Paolo Bonzini wrote:
> > On 10/31/23 23:39, David Matlack wrote:
> > > > > Maybe can you sketch out how you see this proposal being extensible to
> > > > > using guest_memfd for shared mappings?
> > > > For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  What's
> > > > missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
> > > > ensure there are no outstanding references when converting back to private.
> > > >
> > > > For TDX/SNP, assuming we don't find a performant and robust way to do in-place
> > > > conversions, a second fd+offset pair would be needed.
> > > Is there a way to support non-in-place conversions within a single guest_memfd?
> >
> > For TDX/SNP, you could have a hook from KVM_SET_MEMORY_ATTRIBUTES to guest
> > memory.  The hook would invalidate now-private parts if they have a VMA,
> > causing a SIGSEGV/EFAULT if the host touches them.
> >
> > It would forbid mappings from multiple gfns to a single offset of the
> > guest_memfd, because then the shared vs. private attribute would be tied to
> > the offset.  This should not be a problem; for example, in the case of SNP,
> > the RMP already requires a single mapping from host physical address to
> > guest physical address.
>
> I don't see how this can work.  It's not an M:1 scenario (where M is multiple gfns),
> it's a 1:N scenario (where N is multiple offsets).  The *gfn* doesn't change on
> a conversion, what needs to change to do non-in-place conversion is the pfn, which
> is effectively the guest_memfd+offset pair.
>
> So yes, we *could* support non-in-place conversions within a single guest_memfd,
> but it would require a second offset,

Why can't KVM free the existing page at guest_memfd+offset and
allocate a new one when doing non-in-place conversions?

> at which point it makes sense to add a
> second file descriptor as well.  Userspace could still use a single guest_memfd
> instance, i.e. pass in the same file descriptor but different offsets.
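
To make the 1:N point in the quoted text concrete, here is a minimal sketch of one gfn range with two fd+offset bindings, one used while a page is shared and one while it is private. It is an illustration only, not code from this series: struct backing, struct slot_sketch, resolve_backing() and SKETCH_PAGE_SHIFT are hypothetical names, and 4 KiB pages are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; this is not KVM's uAPI. */
struct backing {
	int fd;           /* file descriptor of the backing object */
	uint64_t offset;  /* byte offset of the slot's first page */
};

struct slot_sketch {
	uint64_t base_gfn;          /* first guest frame number in the slot */
	uint64_t npages;            /* slot length in pages */
	struct backing shared;      /* backing used while a gfn is shared */
	struct backing private_mem; /* backing used while a gfn is private */
};

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/*
 * Resolve a gfn to the fd+offset that currently backs it.  The gfn is the
 * same before and after a conversion; only the selected backing changes.
 * Returns false if the gfn lies outside the slot.
 */
static bool resolve_backing(const struct slot_sketch *slot, uint64_t gfn,
			    bool is_private, struct backing *out)
{
	const struct backing *b;

	if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
		return false;

	b = is_private ? &slot->private_mem : &slot->shared;

	out->fd = b->fd;
	out->offset = b->offset +
		      ((gfn - slot->base_gfn) << SKETCH_PAGE_SHIFT);
	return true;
}
```

In this sketch, setting shared.fd and private_mem.fd to the same descriptor with different offsets corresponds to the "same file descriptor but different offsets" case in the quoted text.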