Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE
From: Sean Christopherson <seanjc@google.com>
Date: 2023-01-14 00:01:11
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Fri, Dec 02, 2022, Chao Peng wrote:
@@ -10357,6 +10364,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
 			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+
+		if (kvm_check_request(KVM_REQ_MEMORY_MCE, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
Synthesizing triple fault shutdown is not the right approach. Even with TDX's MCE "architecture" (heavy sarcasm), it's possible that host userspace and the guest have a paravirt interface for handling memory errors without killing the host.
+			r = 0;
+			goto out;
+		}
 	}
@@ -1982,6 +2112,10 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	     !access_ok((void __user *)(unsigned long)mem->userspace_addr,
 			mem->memory_size))
 		return -EINVAL;
+	if (mem->flags & KVM_MEM_PRIVATE &&
+	    (mem->restricted_offset & (PAGE_SIZE - 1) ||
Align indentation.
+	     mem->restricted_offset > U64_MAX - mem->memory_size))
Strongly prefer to use similar logic to existing code that detects wraps:

	mem->restricted_offset + mem->memory_size < mem->restricted_offset

This is also where I'd like to add the "gfn is aligned to offset" check, though my brain is too fried to figure that out right now.
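For reference, a minimal userspace sketch of the wrap-detection idiom suggested above, not the actual kernel change. The struct `mem_region`, the helper `restricted_range_is_valid()`, and the 4 KiB `PAGE_SIZE` are hypothetical stand-ins for illustration; only the comparison itself comes from the review comment. The "gfn is aligned to offset" check is deliberately left out, since its form is not settled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, purely for illustration. */
#define PAGE_SIZE 4096ULL

/* Hypothetical stand-in for the two memslot fields the check touches. */
struct mem_region {
	uint64_t memory_size;
	uint64_t restricted_offset;
};

/* Returns true if the restricted range is page-aligned and does not wrap. */
static bool restricted_range_is_valid(const struct mem_region *mem)
{
	/* The offset must be page-aligned. */
	if (mem->restricted_offset & (PAGE_SIZE - 1))
		return false;

	/*
	 * Unsigned addition wraps modulo 2^64, so the sum is smaller than
	 * the first operand exactly when offset + size overflowed.
	 */
	if (mem->restricted_offset + mem->memory_size < mem->restricted_offset)
		return false;

	return true;
}
```

This rejects the same inputs as the `U64_MAX - mem->memory_size` form in the patch, but matches the shape of the existing `guest_phys_addr` wrap check in the same function.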
+		return -EINVAL;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_MEM_SLOTS_NUM)
 		return -EINVAL;
 	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
@@ -2020,6 +2154,9 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		if ((kvm->nr_memslot_pages + npages) < kvm->nr_memslot_pages)
 			return -EINVAL;
 	} else { /* Modify an existing slot. */
+		/* Private memslots are immutable, they can only be deleted. */
I'm 99% certain I suggested this, but if we're going to make these memslots immutable, then we should straight up disallow dirty logging, otherwise we'll end up with a bizarre uAPI.
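One way to picture the restriction, as a rough userspace sketch rather than a proposed patch: reject the flag combination up front when the memslot is created. The helper name `check_private_memslot_flags()` is hypothetical, and the bit used for `KVM_MEM_PRIVATE` is an assumption for illustration; the first two flag values are the existing uAPI definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Existing uAPI flag values from <linux/kvm.h>. */
#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
#define KVM_MEM_READONLY	(1UL << 1)
/* Assumption: bit picked for illustration; the series defines the real one. */
#define KVM_MEM_PRIVATE		(1UL << 2)

/*
 * Hypothetical helper: a private memslot cannot be modified after
 * creation, so dirty logging could never be toggled on it later.
 * Refusing the combination at creation keeps the uAPI consistent.
 */
static int check_private_memslot_flags(uint32_t flags)
{
	if ((flags & KVM_MEM_PRIVATE) && (flags & KVM_MEM_LOG_DIRTY_PAGES))
		return -EINVAL;

	return 0;
}
```

Where such a check would live, and whether other flags need the same treatment, is left open.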
+		if (mem->flags & KVM_MEM_PRIVATE)
+			return -EINVAL;
 		if ((mem->userspace_addr != old->userspace_addr) ||
 		    (npages != old->npages) ||
 		    ((mem->flags ^ old->flags) & KVM_MEM_READONLY))
@@ -2048,10 +2185,28 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	new->npages = npages;
 	new->flags = mem->flags;
 	new->userspace_addr = mem->userspace_addr;
+	if (mem->flags & KVM_MEM_PRIVATE) {
+		new->restricted_file = fget(mem->restricted_fd);
+		if (!new->restricted_file ||
+		    !file_is_restrictedmem(new->restricted_file)) {
+			r = -EINVAL;
+			goto out;
+		}
+		new->restricted_offset = mem->restricted_offset;
+	}
+
+	new->kvm = kvm;
Set this above, just so that the code flows better.