Re: [PATCH Part2 RFC v4 06/40] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
From: Sean Christopherson <seanjc@google.com>
Date: 2021-07-15 19:53:27
Also in:
kvm, linux-coco, linux-efi, linux-mm, lkml, platform-driver-x86
On Thu, Jul 15, 2021, Dave Hansen wrote:
On 7/15/21 11:56 AM, Sean Christopherson wrote:quoted
quoted
quoted
quoted
+ /* Retry if another processor is modifying the RMP entry. */ + do { + /* Binutils version 2.36 supports the PSMASH mnemonic. */ + asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF" + : "=a"(ret) + : "a"(spa) + : "memory", "cc"); + } while (ret == FAIL_INUSE);Should there be some retry limit here for safety? Or do we know that we'll never be stuck in this loop? Ditto for the loop in rmpupdate.It's probably fine to just leave this. While you could *theoretically* lose this race forever, it's unlikely to happen in practice. If it does, you'll get an easy-to-understand softlockup backtrace which should point here pretty quickly.But should failure here even be tolerated? The TDX cases spin on flows that are _not_ due to (direct) contenion, e.g. a pending interrupt while flushing the cache or lack of randomness when generating a key. In this case, there are two CPUs racing to modify the RMP entry, which implies that the final state of the RMP entry is not deterministic.I was envisioning that two different CPUs could try to smash two *different* 4k physical pages, but collide since they share a 2M page. But, in patch 33, this is called via:quoted
+ write_lock(&kvm->mmu_lock); + + switch (op) { + case SNP_PAGE_STATE_SHARED: + rc = snp_make_page_shared(vcpu, gpa, pfn, level);... Which should make collisions impossible. Did I miss another call-site?
Ya, there's more, e.g. sev_snp_write_page_begin() and snp_handle_rmp_page_fault(), both of which run without holding mmu_lock. The PSMASH operation isn't too concerning, but the associated RMPUDATE is most definitely a concern, e.g. if two vCPUs are trying to access different variants of a page. It's ok if KVM's "response" in such a situation does weird things to the guest, but one of the two operations should "win", which I don't think is guaranteed if multiple RMP violations are racing. I'll circle back to this patch after I've gone through the KVM MMU changes.