Re: [PATCH v9 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit
From: Chao Peng <hidden>
Date: 2022-11-18 01:37:28
Also in:
kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Thu, Nov 17, 2022 at 03:08:17PM +0000, Alex Bennée wrote:
Chao Peng [off-list ref] writes:
On Wed, Nov 16, 2022 at 07:03:49PM +0000, Alex Bennée wrote:
Chao Peng [off-list ref] writes:
On Tue, Nov 15, 2022 at 04:56:12PM +0000, Alex Bennée wrote:
Chao Peng [off-list ref] writes:
This new KVM exit allows userspace to handle memory-related errors. It
indicates an error happens in KVM at guest memory range [gpa, gpa+size).
The flags includes additional information for userspace to handle the
error. Currently bit 0 is defined as 'private memory' where '1'
indicates error happens due to private memory access and '0' indicates
error happens due to shared memory access.

When private memory is enabled, this new exit will be used for KVM to
exit to userspace for shared <-> private memory conversion in memory
encryption usage. In such usage, typically there are two kind of memory
conversions:
  - explicit conversion: happens when guest explicitly calls into KVM
    to map a range (as private or shared), KVM then exits to userspace
    to perform the map/unmap operations.
  - implicit conversion: happens in KVM page fault handler where KVM
    exits to userspace for an implicit conversion when the page is in a
    different state than requested (private or shared).

Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Yu Zhang <redacted>
Signed-off-by: Yu Zhang <redacted>
Signed-off-by: Chao Peng <redacted>
---
 Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
 include/uapi/linux/kvm.h       |  9 +++++++++
 2 files changed, 32 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f3fa75649a78..975688912b8c 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6537,6 +6537,29 @@ array field represents return values. The userspace should update
 the return values of SBI call before resuming the VCPU. For more details on
 RISC-V SBI spec refer, https://github.com/riscv/riscv-sbi-doc.
 
+::
+
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+  #define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1 << 0)
+			__u32 flags;
+			__u32 padding;
+			__u64 gpa;
+			__u64 size;
+		} memory;
+
+If exit reason is KVM_EXIT_MEMORY_FAULT then it indicates that the VCPU has
+encountered a memory error which is not handled by KVM kernel module and
+userspace may choose to handle it. The 'flags' field indicates the memory
+properties of the exit.
+
+ - KVM_MEMORY_EXIT_FLAG_PRIVATE - indicates the memory error is caused by
+   private memory access when the bit is set. Otherwise the memory error is
+   caused by shared memory access when the bit is clear.

What does a shared memory access failure entail?

In the context of confidential computing usages, a guest can issue a
shared memory access while the memory is actually private from the host
point of view. This exit with bit 0 cleared gives userspace a chance to
convert the private memory to shared memory on the host.

I think this should be explicit rather than implied by the absence of
another flag. Sean suggested you might want flags for RWX failures so
maybe something like:

	KVM_MEMORY_EXIT_SHARED_FLAG_READ	(1 << 0)
	KVM_MEMORY_EXIT_SHARED_FLAG_WRITE	(1 << 1)
	KVM_MEMORY_EXIT_SHARED_FLAG_EXECUTE	(1 << 2)
	KVM_MEMORY_EXIT_FLAG_PRIVATE		(1 << 3)

Yes, but I would not add 'SHARED' to RWX. They are not shared-memory
specific; private memory can also set them once they are introduced.

OK so how about:

	KVM_MEMORY_EXIT_FLAG_READ	(1 << 0)
	KVM_MEMORY_EXIT_FLAG_WRITE	(1 << 1)
	KVM_MEMORY_EXIT_FLAG_EXECUTE	(1 << 2)
	KVM_MEMORY_EXIT_FLAG_SHARED	(1 << 3)
	KVM_MEMORY_EXIT_FLAG_PRIVATE	(1 << 4)
We don't actually need a new bit: the opposite of private is shared,
i.e. flags with KVM_MEMORY_EXIT_FLAG_PRIVATE cleared express 'shared'.

Chao
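To make the point above concrete, here is a minimal sketch of how userspace could decode the single flag bit as the patch currently defines it. The struct is a local mirror of the proposed kvm_run.memory layout rather than the real uapi header, and the two helper names are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0, as proposed in this patch. */
#define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1u << 0)

/* Local mirror of the proposed kvm_run.memory layout (assumption: not the uapi header). */
struct memory_exit {
	uint32_t flags;
	uint32_t padding;
	uint64_t gpa;
	uint64_t size;
};

/* Hypothetical helper: true when the exit was caused by a private memory access. */
static bool memory_exit_is_private(const struct memory_exit *e)
{
	assert(e != NULL);
	return (e->flags & KVM_MEMORY_EXIT_FLAG_PRIVATE) != 0;
}

/* Hypothetical helper: 'shared' needs no bit of its own, it is the private bit being clear. */
static bool memory_exit_is_shared(const struct memory_exit *e)
{
	return !memory_exit_is_private(e);
}
```

With this encoding the two states are mutually exclusive by construction, so userspace never has to decide what an exit with both (or neither) of a SHARED and a PRIVATE bit set would mean.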
Thanks,
Chao
which would allow you to signal the various failure modes of the shared region, or that you had accessed private memory.
If you envision any other failure modes it might be worth making it
explicit with additional flags.

Sean mentioned some more usages [1][2] other than the memory conversion
for confidential usage. But I would leave those flags to be added in
the future, after those usages have been well discussed.

[1] https://lkml.kernel.org/r/20200617230052.GB27751@linux.intel.com
[2] https://lore.kernel.org/all/YKxJLcg%2FWomPE422@google.com
I also wonder if a bitmask makes sense if there can only be one reason
for a failure? Maybe all that is needed is a reason enum?

Though we only have one reason right now, we still want to leave room
for future extension. An enum can express a single value at a time
well, but a bitmask makes it possible to express multiple orthogonal
flags.

I agree if multiple orthogonal failures can occur at once a bitmask is
the right choice.
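As a rough illustration of the bitmask argument, the sketch below combines an access-type bit with the private bit in a single flags word, which a plain reason enum could not express without enumerating every combination. The DEMO_* names and bit positions are purely hypothetical: RWX flags were only floated in this thread and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; none of these are defined by the patch. */
#define DEMO_FLAG_READ		(1u << 0)
#define DEMO_FLAG_WRITE		(1u << 1)
#define DEMO_FLAG_EXECUTE	(1u << 2)
#define DEMO_FLAG_PRIVATE	(1u << 3)

#define DEMO_FLAG_ALL \
	(DEMO_FLAG_READ | DEMO_FLAG_WRITE | DEMO_FLAG_EXECUTE | DEMO_FLAG_PRIVATE)

/* Reject words that carry bits this sketch does not know about. */
static bool demo_flags_valid(uint32_t flags)
{
	return (flags & ~DEMO_FLAG_ALL) == 0;
}

/* Two orthogonal properties tested at once: access type and memory kind. */
static bool demo_is_private_write(uint32_t flags)
{
	const uint32_t want = DEMO_FLAG_WRITE | DEMO_FLAG_PRIVATE;

	assert(demo_flags_valid(flags));
	return (flags & want) == want;
}
```

Checking for unknown bits, as demo_flags_valid() does, is also what lets a flags word grow later without older userspace silently misreading it.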
Chao
+
+'gpa' and 'size' indicate the memory range the error occurs at. The userspace
+may handle the error and return to KVM to retry the previous memory access.
+
 ::
 
 		/* KVM_EXIT_NOTIFY */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f1ae45c10c94..fa60b032a405 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -300,6 +300,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_SBI        35
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
+#define KVM_EXIT_MEMORY_FAULT     38
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -538,6 +539,14 @@ struct kvm_run {
 #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
 			__u32 flags;
 		} notify;
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+#define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1 << 0)
+			__u32 flags;
+			__u32 padding;
+			__u64 gpa;
+			__u64 size;
+		} memory;
 		/* Fix the size of the union. */
 		char padding[256];
 	};

--
Alex Bennée
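For readers following the thread, the "handle the error and return to KVM to retry" flow from the documentation hunk might look roughly like the toy model below. Everything prefixed demo_ or DEMO_ is hypothetical: the per-page table stands in for whatever conversion interface a real VMM would call, and the struct is a local mirror of the proposed kvm_run.memory layout, not the uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1u << 0)	/* as proposed in the patch */
#define DEMO_PAGE_SIZE			4096u		/* assumption for this sketch */
#define DEMO_NR_PAGES			16u		/* toy guest of 16 pages */

struct memory_exit {
	uint32_t flags;
	uint32_t padding;
	uint64_t gpa;
	uint64_t size;
};

/* Toy VMM state: one entry per guest page, true means private. */
struct demo_vmm {
	bool page_private[DEMO_NR_PAGES];
};

/*
 * Convert [gpa, gpa + size) to the state named by the private bit.
 * Returns 0 when the vCPU could be re-entered to retry the access,
 * or -1 when the range is unaligned, empty, or outside the toy guest.
 */
static int demo_handle_memory_exit(struct demo_vmm *vmm, const struct memory_exit *e)
{
	const uint64_t limit = (uint64_t)DEMO_NR_PAGES * DEMO_PAGE_SIZE;

	assert(vmm != NULL && e != NULL);

	if (e->size == 0 || e->gpa % DEMO_PAGE_SIZE || e->size % DEMO_PAGE_SIZE)
		return -1;
	if (e->gpa >= limit || e->size > limit - e->gpa)
		return -1;

	const bool to_private = (e->flags & KVM_MEMORY_EXIT_FLAG_PRIVATE) != 0;

	for (uint64_t gpa = e->gpa; gpa < e->gpa + e->size; gpa += DEMO_PAGE_SIZE)
		vmm->page_private[gpa / DEMO_PAGE_SIZE] = to_private;

	return 0;
}
```

In a real VMM the return value would decide whether to call KVM_RUN again or to report the fault; the sketch only models the bookkeeping.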