Thread: 10 messages, 4 authors, 2024-10-23

Re: [PATCH RFC v2 0/5] mm: Introduce guest_memfd library

From: Paolo Bonzini <pbonzini@redhat.com>
Date: 2024-10-10 13:04:34
Also in: kvm, linux-arm-msm, linux-mm, lkml

On 8/30/24 00:24, Elliot Berman wrote:
In preparation for adding more features to KVM's guest_memfd, refactor
and introduce a library which abstracts some of the core-mm decisions
about managing folios associated with the file. The refactor serves two
purposes:

1. Provide an easier way to reason about memory in guest_memfd. With KVM
supporting multiple confidentiality models (TDX, SEV-SNP, pKVM, ARM
CCA), and coming support for allowing kernel and userspace to access
this memory, it seems necessary to create a stronger abstraction between
core-mm concerns and hypervisor concerns.

2. Provide a common implementation for other hypervisors (Gunyah) to use.

To create a guest_memfd, the owner provides operations to attempt to
unmap the folio and check whether a folio is accessible to the host. The
owner can call guest_memfd_make_inaccessible() to ensure Linux doesn't
have the folio mapped.
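
[Editorial illustration, not code from the series.] The owner/library split
described above can be modelled in plain userspace C. This is a minimal
sketch under stated assumptions: struct toy_folio, struct toy_gmem_ops, and
the toy_* functions are hypothetical names invented for this example, and
only guest_memfd_make_inaccessible() is named in the cover letter. The real
library operates on struct folio and may differ in every detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	int host_mappings;	/* host mappings still present */
	bool pinned;		/* a pinned folio cannot be unmapped */
	bool accessible;	/* may the host touch this memory? */
};

/* Hypothetical owner-provided callbacks (names are illustrative only). */
struct toy_gmem_ops {
	/* Attempt to drop all host mappings; 0 on success, -EBUSY if not possible. */
	int (*try_unmap)(struct toy_folio *folio);
	/* Report whether the host may currently access the folio. */
	bool (*is_accessible)(const struct toy_folio *folio);
};

/* Example owner implementation of the two callbacks. */
static int toy_try_unmap(struct toy_folio *folio)
{
	if (folio->pinned)
		return -EBUSY;
	folio->host_mappings = 0;
	return 0;
}

static bool toy_is_accessible(const struct toy_folio *folio)
{
	return folio->accessible;
}

static const struct toy_gmem_ops toy_owner_ops = {
	.try_unmap = toy_try_unmap,
	.is_accessible = toy_is_accessible,
};

/*
 * Toy analogue of guest_memfd_make_inaccessible(): succeed only once the
 * owner has confirmed that no host mapping remains.
 */
static int toy_make_inaccessible(const struct toy_gmem_ops *ops,
				 struct toy_folio *folio)
{
	int ret;

	if (!ops->is_accessible(folio))
		return 0;	/* nothing to do */

	ret = ops->try_unmap(folio);
	if (ret)
		return ret;	/* still mapped; the caller may retry later */

	assert(folio->host_mappings == 0);
	folio->accessible = false;
	return 0;
}
```

In this model, a caller passes &toy_owner_ops and a folio to
toy_make_inaccessible() and treats -EBUSY as "the host still has the folio
mapped".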

The series first introduces a guest_memfd library based on the current
KVM (next) implementation, then adds a few features needed for Gunyah and
arm64 pKVM. The Gunyah usage of the series will be posted separately
shortly after sending this series. I'll work with Fuad on using the
guest_memfd library for arm64 pKVM based on the feedback received.

There are a few TODOs still pending.
- The KVM patch isn't tested. I don't have access to a SEV-SNP setup to be
   able to test.
- I've not yet investigated deeply whether having the guest_memfd
   library helps live migration. I'd appreciate any input on that part.
- We should consider consolidating the adjust_direct_map() in
   arch/x86/virt/svm/sev.c so guest_memfd can take care of it.
- There's a possible race where a path has incremented the folio ref
   count and is about to increment the safe counter, but is waiting for
   the folio lock to be released. The owner of the folio lock will see
   mismatched counter values and be unable to convert to (in)accessible,
   even though it should be okay to do so.
  
I'd appreciate any feedback, especially on the direction I'm taking for
tracking the (in)accessible state.

Signed-off-by: Elliot Berman <redacted>

Changes in v2:
- Significantly reworked to introduce "accessible" and "safe" reference
   counters

Was there any discussion on this change?  If not, can you explain it a
bit more since it's the biggest change compared to the KVM design?  I 
suppose the reference counting is used in relation to mmap, but it would 
be nice to have a few more words on how the counts are used and an 
explanation of when (especially) the accessible atomic_t can take any 
value other than 0/1.

As an aside, allocating 8 bytes of per-folio private memory (and 
dereferencing the pointer, too) is a bit of a waste considering that the 
private pointer itself is 64 bits on all platforms of interest.
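
To illustrate the aside (a sketch only, not code from the series): if the
8 bytes are two 32-bit counters, which is an assumption based on the
"accessible" and "safe" counters mentioned in the changelog, they fit in
one 64-bit word and need no separate allocation or pointer dereference.
The toy_* names and the bit layout below are hypothetical, and the helpers
are deliberately non-atomic; a kernel version would need atomic updates.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout of a 64-bit private word:
 *   bits  0..31  "accessible" count
 *   bits 32..63  "safe" count
 */
static inline uint64_t toy_pack(uint32_t accessible, uint32_t safe)
{
	return ((uint64_t)safe << 32) | accessible;
}

static inline uint32_t toy_accessible(uint64_t priv)
{
	return (uint32_t)priv;
}

static inline uint32_t toy_safe(uint64_t priv)
{
	return (uint32_t)(priv >> 32);
}

/* Increment helpers; the asserts guard against one field carrying into the other. */
static inline uint64_t toy_inc_accessible(uint64_t priv)
{
	assert(toy_accessible(priv) != UINT32_MAX);
	return priv + 1;
}

static inline uint64_t toy_inc_safe(uint64_t priv)
{
	assert(toy_safe(priv) != UINT32_MAX);
	return priv + ((uint64_t)1 << 32);
}
```

Both counts are then read straight from the word that would otherwise hold
the pointer.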

Paolo

- Link to v1:
   https://lore.kernel.org/r/20240805-guest-memfd-lib-v1-0-e5a29a4ff5d7@quicinc.com

---
Elliot Berman (5):
       mm: Introduce guest_memfd
       mm: guest_memfd: Allow folios to be accessible to host
       kvm: Convert to use guest_memfd library
       mm: guest_memfd: Add ability for userspace to mmap pages
       mm: guest_memfd: Add option to remove inaccessible memory from direct map

  arch/x86/kvm/svm/sev.c      |   3 +-
  include/linux/guest_memfd.h |  49 ++++
  mm/Kconfig                  |   3 +
  mm/Makefile                 |   1 +
  mm/guest_memfd.c            | 667 ++++++++++++++++++++++++++++++++++++++++++++

I think I'd rather have this in virt/lib.

Paolo