Re: [PATCH Part2 RFC v4 35/40] KVM: Add arch hooks to track the host write to guest memory
From: Brijesh Singh <hidden>
Date: 2021-07-20 15:54:04
Also in:
linux-coco, linux-crypto, linux-efi, linux-mm, lkml, platform-driver-x86
On 7/19/21 6:30 PM, Sean Christopherson wrote: ...>
NAK on converting RMP entries in response to guest accesses. Corrupting guest data (due to dropping the "validated" flag) on a rogue/incorrect guest emulation request or misconfigured PV feature is double ungood. The potential kernel panic below isn't much better.
I also debated myself whether its okay to transition the page state to shared to complete the write operation. I am good with removing the converting RMP entries from the patch, and that will also remove the kernel panic code.
And I also don't think we need this heavyweight flow for user access, e.g. __copy_to_user(), just eat the RMP violation #PF like all other #PFs and exit to userspace with -EFAULT.
Yes, we could improve the __copy_to_user() to eat the RMP violation. I was tempted to go on that path but struggled to find a strong reason for it and was not sure if that accepted. I can add that support in next rev.
kvm_vcpu_map() and friends might need the manual lookup, at least initially,
Yes, the enhancement to the __copy_to_user() does not solve this problem and for it we need to do the manually lookup. but
in an ideal world that would be naturally handled by gup(), e.g. by unmapping guest private memory or whatever approach TDX ends up employing to avoid #MCs.
quoted
+ */ + if (rmpentry_assigned(e)) { + pr_err("SEV-SNP: write to guest private gfn %llx\n", gfn); + rc = snp_make_page_shared(kvm_get_vcpu(kvm, 0), + gfn << PAGE_SHIFT, pfn, PG_LEVEL_4K); + BUG_ON(rc != 0); + } +}...quoted
+void kvm_arch_write_gfn_begin(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn) +{ + update_gfn_track(slot, gfn, KVM_PAGE_TRACK_WRITE, 1);Tracking only writes isn't correct, as KVM reads to guest private memory will return garbage. Pulling the rug out from under KVM reads won't fail as spectacularly as writes (at least not right away), but they'll still fail. I'm actually ok reading garbage if the guest screws up, but KVM needs consistent semantics. Good news is that per-gfn tracking is probably overkill anyways. As mentioned above, user access don't need extra magic, they either fail or they don't. For kvm_vcpu_map(), one thought would be to add a "short-term" map variant that is not allowed to be retained across VM-Entry, and then use e.g. SRCU to block PSC requests until there are no consumers.
Sounds good to me, i will add the support in the next rev. thanks