Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory
From: Chao Peng <hidden>
Date: 2023-01-06 09:46:36
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Thu, Jan 05, 2023 at 11:23:01AM +0000, Jarkko Sakkinen wrote:
On Fri, Dec 02, 2022 at 02:13:41PM +0800, Chao Peng wrote:
In memory encryption usage, guest memory may be encrypted with special key and can be accessed only by the guest itself. We call such memory private memory. It's valueless and sometimes can cause problem to allow userspace to access guest private memory.

This new KVM memslot extension allows guest private memory being provided through a restrictedmem backed file descriptor(fd) and userspace is restricted to access the bookmarked memory in the fd.

This new extension, indicated by the new flag KVM_MEM_PRIVATE, adds two additional KVM memslot fields restricted_fd/restricted_offset to allow userspace to instruct KVM to provide guest memory through restricted_fd. 'guest_phys_addr' is mapped at the restricted_offset of restricted_fd and the size is 'memory_size'.

The extended memslot can still have the userspace_addr(hva). When use, a single memslot can maintain both private memory through restricted_fd and shared memory through userspace_addr. Whether the private or shared part is visible to guest is maintained by other KVM code.

A restrictedmem_notifier field is also added to the memslot structure to allow the restricted_fd's backing store to notify KVM the memory change, KVM then can invalidate its page table entries or handle memory errors.

Together with the change, a new config HAVE_KVM_RESTRICTED_MEM is added and right now it is selected on X86_64 only.

To make future maintenance easy, internally use a binary compatible alias struct kvm_user_mem_region to handle both the normal and the '_ext' variants.

Feels bit hacky IMHO, and more like a completely new feature than an extension. Why not just add a new ioctl? The commit message does not address the most essential design here.
Yes, a new ioctl is always an option for this kind of change. The balance point here is that we also want to avoid 'too many ioctls' when the functionality is similar. The '_ext' variant reuses all the existing fields of the 'normal' variant and, most importantly, KVM can reuse most of the existing code internally. I can certainly add some words to the commit message to explain this design choice.

Thanks,
Chao
BR, Jarkko
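
For readers following the design argument, the "binary compatible alias" idea discussed in the thread can be sketched in plain C. This is only an illustration under stated assumptions, not the patch's code: every `sketch_` name, the flag value, and the padding field are hypothetical stand-ins, and only the field names slot/flags/guest_phys_addr/memory_size/userspace_addr and restricted_fd/restricted_offset are taken from the discussion above. The point it shows is that when the '_ext' variant places the existing fields first, one flat internal struct can overlay both variants, so a single code path handles either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value, chosen for this sketch only. */
#define SKETCH_MEM_PRIVATE (1u << 2)

/* Stand-in for the existing 'normal' memslot argument. */
struct sketch_region {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* Stand-in for the '_ext' variant: the normal fields come first,
 * the new fields are appended after them. */
struct sketch_region_ext {
	struct sketch_region region;
	uint64_t restricted_offset;
	uint32_t restricted_fd;
	uint32_t pad; /* hypothetical padding to keep 8-byte alignment */
};

/* Flat internal alias with the same byte layout as the '_ext' variant. */
struct sketch_mem_region {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t restricted_offset;
	uint32_t restricted_fd;
	uint32_t pad;
};

/* Compile-time checks that the alias really is binary compatible. */
_Static_assert(sizeof(struct sketch_mem_region) ==
	       sizeof(struct sketch_region_ext), "size mismatch");
_Static_assert(offsetof(struct sketch_mem_region, userspace_addr) ==
	       offsetof(struct sketch_region, userspace_addr),
	       "normal fields must line up");
_Static_assert(offsetof(struct sketch_mem_region, restricted_offset) ==
	       offsetof(struct sketch_region_ext, restricted_offset),
	       "extended fields must line up");

/*
 * Convert a caller-supplied argument into the internal alias.
 * Assumption of this sketch: the private flag tells us whether the
 * caller passed the larger '_ext' layout. The extended fields stay
 * zero for a normal region.
 */
struct sketch_mem_region sketch_to_internal(const void *arg)
{
	struct sketch_mem_region mem;

	memset(&mem, 0, sizeof(mem));
	/* The normal part is always present. */
	memcpy(&mem, arg, sizeof(struct sketch_region));
	/* Copy the appended fields only when the flag announces them. */
	if (mem.flags & SKETCH_MEM_PRIVATE)
		memcpy(&mem, arg, sizeof(struct sketch_region_ext));
	return mem;
}
```

A caller would fill in either a `sketch_region` or a `sketch_region_ext` (with `SKETCH_MEM_PRIVATE` set in `flags`) and pass its address to `sketch_to_internal()`. A separate ioctl, as suggested in the review, would avoid the flag-dependent argument size at the cost of a second entry point.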