Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes
From: Sean Christopherson <seanjc@google.com>
Date: 2023-05-19 19:58:27
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Fri, May 19, 2023, Nicolas Saenz Julienne wrote:
> Hi Sean,
>
> On Fri May 19, 2023 at 6:23 PM UTC, Sean Christopherson wrote:
> > On Fri, May 19, 2023, Nicolas Saenz Julienne wrote:
> > > Hi,
> > >
> > > On Fri Dec 2, 2022 at 6:13 AM UTC, Chao Peng wrote:
> > >
> > > [...]
> > > > +The user sets the per-page memory attributes to a guest memory range indicated
> > > > +by address/size, and in return KVM adjusts address and size to reflect the
> > > > +actual pages of the memory range have been successfully set to the attributes.
> > > > +If the call returns 0, "address" is updated to the last successful address + 1
> > > > +and "size" is updated to the remaining address size that has not been set
> > > > +successfully. The user should check the return value as well as the size to
> > > > +decide if the operation succeeded for the whole range or not. The user may want
> > > > +to retry the operation with the returned address/size if the previous range was
> > > > +partially successful.
> > > > +
> > > > +Both address and size should be page aligned and the supported attributes can be
> > > > +retrieved with KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES.
> > > > +
> > > > +The "flags" field may be used for future extensions and should be set to 0s.
> > >
> > > We have been looking into adding support for the Hyper-V VSM extensions
> > > which Windows uses to implement Credential Guard. This interface seems
> > > like a good fit for one of its underlying features. I just wanted to
> > > share a bit about it, and see if we can expand it to fit this use-case.
> > > Note that this was already briefly discussed between Sean and Alex some
> > > time ago[1].
> > >
> > > VSM introduces isolated guest execution contexts called Virtual Trust
> > > Levels (VTL) [2]. Each VTL has its own memory access protections,
> > > virtual processors states, interrupt controllers and overlay pages. VTLs
> > > are hierarchical and might enforce memory protections on less privileged
> > > VTLs. Memory protections are enforced on a per-GPA granularity.
> > >
> > > The list of possible protections is:
> > > - No access -- This needs a new memory attribute, I think.
> >
> > No, if KVM provides three bits for READ, WRITE, and EXECUTE, then userspace
> > can get all the possible combinations. E.g. this is RWX=000b
>
> That's not what the current implementation does, when attributes is equal
> 0 it clears the entries from the xarray:
>
> static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
> 					   struct kvm_memory_attributes *attrs)
> {
> 	entry = attrs->attributes ? xa_mk_value(attrs->attributes) : NULL;
> 	[...]
> 	for (i = start; i < end; i++)
> 		if (xa_err(xa_store(&kvm->mem_attr_array, i, entry,
> 				    GFP_KERNEL_ACCOUNT)))
> 			break;
> }
>
> From Documentation/core-api/xarray.rst:
>
> "There is no difference between an entry that has never been stored to, one
> that has been erased and one that has most recently had ``NULL`` stored to
> it."
>
> The way I understood the series, there needs to be a differentiation
> between no attributes (regular page fault) and no-access.
Ah, I see what you're saying. There are multiple ways to solve things without a
"no access" flag while still maintaining an empty xarray for the default case.
E.g. invert the flags to be DENY flags[*], have an internal-only "entry valid" flag,
etc.
[*] I vaguely recall suggesting a "deny" approach somewhere, but I may just be
making things up to make it look like I thought deeply about this ;-)
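
A rough sketch of the two encodings mentioned above, purely for illustration.
Nothing here is taken from the series: every name (ATTR_*, deny_encode(),
valid_encode(), etc.) is hypothetical, and a plain 32-bit value stands in for
what would be stored in the xarray. The only property modeled is that a stored
value of 0 is indistinguishable from "no entry".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute bits; not the uAPI proposed in the series. */
#define ATTR_R		(1u << 0)
#define ATTR_W		(1u << 1)
#define ATTR_X		(1u << 2)
#define ATTR_RWX	(ATTR_R | ATTR_W | ATTR_X)

/*
 * Option 1: store DENY bits instead of ALLOW bits.  The default case
 * (nothing denied) encodes as 0, i.e. no entry, while "no access"
 * encodes as a non-zero value and so survives being stored.
 */
static inline uint32_t deny_encode(uint32_t allowed)
{
	return ~allowed & ATTR_RWX;
}

static inline uint32_t deny_decode(uint32_t stored)
{
	return ~stored & ATTR_RWX;
}

/*
 * Option 2: keep ALLOW bits, but OR in an internal-only "entry valid"
 * bit (hypothetical, never exposed to userspace) so that RWX=000b is
 * still stored as a non-zero value.
 */
#define ATTR_VALID	(1u << 31)

static inline uint32_t valid_encode(uint32_t allowed)
{
	return (allowed & ATTR_RWX) | ATTR_VALID;
}

/* Returns false when there is no entry, i.e. default handling applies. */
static inline bool valid_decode(uint32_t stored, uint32_t *allowed)
{
	if (!(stored & ATTR_VALID))
		return false;

	*allowed = stored & ATTR_RWX;
	return true;
}
```

In both cases the xarray stays empty for the default case, and "no access"
maps to a non-zero stored value. The difference is where the default lives:
with deny flags, "no entry" means full access, whereas with a valid bit,
"no entry" means "no attributes were ever set".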