Re: [RFC PATCH v12 18/33] KVM: x86/mmu: Handle page fault for private memory
From: Yan Zhao <hidden>
Date: 2023-09-18 01:23:46
Also in:
kvm, kvm-riscv, kvmarm, linux-arm-kernel, linux-fsdevel, linux-mips, linux-mm, linux-riscv, linuxppc-dev, lkml
On Fri, Sep 15, 2023 at 07:26:16AM -0700, Sean Christopherson wrote:
On Fri, Sep 15, 2023, Yan Zhao wrote:quoted
On Wed, Sep 13, 2023 at 06:55:16PM -0700, Sean Christopherson wrote: ....quoted
+static void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu, + struct kvm_page_fault *fault) +{ + kvm_prepare_memory_fault_exit(vcpu, fault->gfn << PAGE_SHIFT, + PAGE_SIZE, fault->write, fault->exec, + fault->is_private); +} + +static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu, + struct kvm_page_fault *fault) +{ + int max_order, r; + + if (!kvm_slot_can_be_private(fault->slot)) { + kvm_mmu_prepare_memory_fault_exit(vcpu, fault); + return -EFAULT; + } + + r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &fault->pfn, + &max_order); + if (r) { + kvm_mmu_prepare_memory_fault_exit(vcpu, fault); + return r; + } + + fault->max_level = min(kvm_max_level_for_order(max_order), + fault->max_level); + fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY); + + return RET_PF_CONTINUE; +} + static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) { struct kvm_memory_slot *slot = fault->slot;@@ -4293,6 +4356,14 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault return RET_PF_EMULATE; } + if (fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn)) {In patch 21, fault->is_private is set as: ".is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT)", then, the inequality here means memory attribute has been updated after last check. So, why an exit to user space for converting is required instead of a mere retry? Or, is it because how .is_private is assigned in patch 21 is subjected to change in future?This. Retrying on SNP or TDX would hang the guest. I suppose we could special
Is this because if the guest access a page in private way (e.g. via private key in TDX), the returned page must be a private page?
case VMs where .is_private is derived from the memory attributes, but the SW_PROTECTED_VM type is primary a development vehicle at this point. I'd like to have it mimic SNP/TDX as much as possible; performance is a secondary concern.
Ok. But this mimic is somewhat confusing as it may be problematic in below scenario,
though sane guest should ensure no one is accessing a page before doing memory
conversion.
CPU 0 CPU 1
access GFN A in private way
fault->is_private=true
convert GFN A to shared
set memory attribute of A to shared
faultin, mismatch and exit
set memory attribute of A
to private
vCPU access GFN A in shared way
fault->is_private = true
faultin, match and map a private PFN B
vCPU accesses private PFN B in shared way
E.g. userspace needs to be prepared for "spurious" exits due to races on SNP and TDX, which this can theoretically exercise. Though the window is quite small so I doubt that'll actually happen in practice; which of course also makes it less important to retry instead of exiting.