Thread: 58 messages, 10 authors, 2022-08-25

Re: [PATCH v6 0/8] KVM: mm: fd-based approach for supporting KVM guest private memory

From: Chao Peng <hidden>
Date: 2022-06-07 07:01:48
Also in: kvm, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel

On Mon, Jun 06, 2022 at 01:09:50PM -0700, Vishal Annapurve wrote:
Private memory map/unmap and conversion
---------------------------------------
Userspace's map/unmap operations are done by fallocate() calls on the
backing store fd.
  - map: default fallocate() with mode=0.
  - unmap: fallocate() with FALLOC_FL_PUNCH_HOLE.
The map/unmap will trigger the memfile_notifier_ops above to let KVM map/unmap
secondary MMU page tables.
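
As a rough illustration of the two fallocate() modes described above, the sketch below uses an ordinary memfd as a stand-in for the backing store. That is an assumption made for the example: the private-memory fd type and the memfile_notifier callbacks from this series are not exercised here, and the helper names (backing_map, backing_unmap, demo_map_unmap) are hypothetical. Per fallocate(2), FALLOC_FL_PUNCH_HOLE has to be ORed with FALLOC_FL_KEEP_SIZE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate(), memfd_create(), FALLOC_FL_* */
#endif
#include <fcntl.h>              /* fallocate(), FALLOC_FL_PUNCH_HOLE */
#include <sys/mman.h>           /* memfd_create() */
#include <sys/stat.h>           /* fstat() */
#include <unistd.h>             /* close() */

/* Hypothetical helper: "map" = allocate backing pages for [offset, offset+len). */
static int backing_map(int fd, off_t offset, off_t len)
{
    return fallocate(fd, 0, offset, len);
}

/* Hypothetical helper: "unmap" = free the backing pages but keep the file size. */
static int backing_unmap(int fd, off_t offset, off_t len)
{
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}

/* Returns 0 if both calls succeed and the file size survives the hole punch. */
static int demo_map_unmap(void)
{
    const off_t len = 2 * 1024 * 1024;
    struct stat st;
    int ok;

    /* Plain memfd used as a stand-in for the backing store fd (assumption). */
    int fd = memfd_create("guest-ram-demo", 0);
    if (fd < 0)
        return -1;

    ok = backing_map(fd, 0, len) == 0 &&
         backing_unmap(fd, 0, len) == 0 &&
         fstat(fd, &st) == 0 &&
         st.st_size == len;

    close(fd);
    return ok ? 0 : -1;
}
```

In the series itself, it is these two calls that would trigger the notifier and let KVM update the secondary MMU; the sketch only shows the userspace side.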
....
   QEMU: https://github.com/chao-p/qemu/tree/privmem-v6

An example QEMU command line for TDX test:
-object tdx-guest,id=tdx \
-object memory-backend-memfd-private,id=ram1,size=2G \
-machine q35,kvm-type=tdx,pic=no,kernel_irqchip=split,memory-encryption=tdx,memory-backend=ram1
There should be more discussion around double allocation scenarios
when using the private fd approach. A malicious guest or buggy
userspace VMM can cause physical memory to be allocated for both
the shared (memory accessible from host) and private fds backing the
guest memory.
The userspace VMM will need to unback the shared guest memory while
handling the conversion from shared to private in order to prevent
double allocation, even with malicious guests or bugs in the userspace VMM.
I don't know how a malicious guest can cause that. The initial design of
this series is to put the private and shared memory into two different
address spaces and give the userspace VMM the flexibility to convert
between the two. It can choose to respect the guest's conversion request
or not.

It's possible for a userspace VMM to cause double allocation if it fails
to call the unback operation during the conversion. This may or may not
be a bug. Double allocation is not necessarily wrong, even conceptually.
TDX, at least, allows a guest to run half shared and half private, which
means both shared and private memory can be in effect at once. Unbacking
the memory is just the current QEMU implementation choice.

Chao
Options to unback shared guest memory seem to be:
1) madvise(.., MADV_DONTNEED/MADV_REMOVE) - This option won't stop the
kernel from backing the shared memory on subsequent write accesses
2) fallocate(..., FALLOC_FL_PUNCH_HOLE...) - For file-backed shared
guest memory, this option is similar to madvise, since it would
still allow shared memory to get backed on write accesses
3) munmap - This would give up the contiguous virtual memory region
reservation and leave holes in the guest backing memory, which might make
guest memory management difficult.
4) mprotect(... PROT_NONE) - This would keep the virtual memory
address range backing the guest memory preserved
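
A small sketch of options 1 and 4, using an anonymous private mapping as a stand-in for shared guest memory. That is an assumption for the example: real guest RAM may be file-backed, where MADV_DONTNEED behaves differently. The helper names are hypothetical and this is not the QEMU implementation. It shows that MADV_DONTNEED discards the page yet a later write silently backs it again, and that adding mprotect(PROT_NONE) keeps the address range reserved while turning stray accesses into faults.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>           /* mmap(), madvise(), mprotect(), munmap() */
#include <unistd.h>             /* sysconf() */

/* Map one read/write anonymous page; returns NULL on failure. */
static char *map_one_page(size_t *len)
{
    void *p;

    *len = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, *len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Option 1 alone: the page is dropped, but the next write backs it again. */
static int demo_dontneed_rebacks(void)
{
    size_t len;
    volatile char *p = map_one_page(&len);
    int rc, discarded, rebacked;

    if (!p)
        return -1;
    p[0] = 0x5a;                              /* fault the page in */
    rc = madvise((void *)p, len, MADV_DONTNEED);
    discarded = (p[0] == 0);                  /* old contents are gone */
    p[0] = 0x7f;                              /* nothing stops this write */
    rebacked = (p[0] == 0x7f);
    munmap((void *)p, len);
    return (rc == 0 && discarded && rebacked) ? 0 : -1;
}

/* Hypothetical combination of options 1 and 4: discard, then fence off. */
static int unback_and_fence(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_DONTNEED) != 0)
        return -1;
    return mprotect(addr, len, PROT_NONE);    /* later accesses now fault */
}

static int demo_unback_and_fence(void)
{
    size_t len;
    volatile char *p = map_one_page(&len);
    int rc, reopened, zeroed;

    if (!p)
        return -1;
    p[0] = 0x5a;
    rc = unback_and_fence((void *)p, len);
    /* The range is still reserved; reopen it to confirm it was discarded. */
    reopened = mprotect((void *)p, len, PROT_READ | PROT_WRITE);
    zeroed = (reopened == 0 && p[0] == 0);
    munmap((void *)p, len);
    return (rc == 0 && zeroed) ? 0 : -1;
}
```

mprotect() by itself only changes permissions and does not free pages, which is why the sketch pairs it with a discard.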

ram_block_discard_range_fd from the reference implementation
(https://github.com/chao-p/qemu/tree/privmem-v6) seems to rely on
fallocate/madvise.

Any thoughts/suggestions around better ways to unback the shared
memory in order to avoid double allocation scenarios?

Regards,
Vishal