Thread: 16 messages, 5 authors, 2021-08-20

Re: [PATCH v3 2/5] KVM: x86: invert KVM_HYPERCALL to default to VMMCALL

From: "Kalra, Ashish" <Ashish.Kalra@amd.com>
Date: 2021-08-19 22:09:01
Also in: kvm, lkml

Hello Sean,
On Aug 20, 2021, at 2:15 AM, Sean Christopherson [off-list ref] wrote:

Preferred shortlog prefix for KVM guest changes is "x86/kvm".  "KVM: x86" is for
host changes.
On Tue, Jun 08, 2021, Ashish Kalra wrote:
From: Ashish Kalra <ashish.kalra@amd.com>

KVM hypercall framework relies on alternative framework to patch the
VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
apply_alternative() is called then it defaults to VMCALL. The approach
works fine on a non-SEV guest. A VMCALL would cause a #UD, and the hypervisor
will be able to decode the instruction and do the right thing. But
when SEV is active, guest memory is encrypted with the guest key and the
hypervisor will not be able to decode the instruction bytes.

So invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to VMMCALL
and opt into VMCALL.
The changelog needs to explain why SEV hypercalls need to be made before
apply_alternative(), why it's ok to make Intel CPUs take #UDs on the unknown
VMMCALL, and why this is not creating the same conundrum for TDX.
I think it makes more sense to stick to the original approach/patch, i.e., introduce a new private hypercall interface like kvm_sev_hypercall3() and let early paravirtualized kernel code invoke this private hypercall interface wherever required.

This helps avoid Intel CPUs taking unnecessary #UDs and also avoids hacks such as the one below.

TDX code can introduce a similar private hypercall interface for its early paravirtualized kernel code if required.
Actually, I don't think making Intel CPUs take #UDs is acceptable.  This patch
breaks Linux on upstream KVM on Intel due to a bug in upstream KVM.  KVM attempts
to patch the "wrong" hypercall to the "right" hypercall, but stupidly does so
via an emulated write.  I.e. KVM honors the guest page table permissions and
injects a !WRITABLE #PF on the VMMCALL RIP if the kernel code is mapped RX.

In other words, trusting the VMM to not screw up the #UD is a bad idea.  This also
makes documenting the "why does SEV need super early hypercalls" extra important.
Makes sense.

Thanks,
Ashish
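
The failure mode Sean describes above (an emulated write that honors guest page permissions) can be illustrated with a small userspace model. This is only a sketch, not KVM code: struct guest_page, emulated_fix_hypercall(), and the result enum are hypothetical names, and the model assumes the fixup is a plain 3-byte overwrite of the hypercall instruction (VMCALL is 0f 01 c1, VMMCALL is 0f 01 d9).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model; none of these names exist in KVM. */
enum fixup_result { FIXUP_PATCHED, FIXUP_PAGE_FAULT };

struct guest_page {
	bool writable;          /* guest page table permission for this mapping */
	unsigned char insn[3];  /* the 3-byte hypercall instruction at RIP */
};

static const unsigned char VMCALL_BYTES[3]  = { 0x0f, 0x01, 0xc1 };
static const unsigned char VMMCALL_BYTES[3] = { 0x0f, 0x01, 0xd9 };

/*
 * Models a host that rewrites the "wrong" hypercall instruction through an
 * emulated guest write. Because the write goes through the guest's own
 * permissions, a read-only (RX) kernel text mapping yields a fault instead
 * of a patched instruction.
 */
static enum fixup_result emulated_fix_hypercall(struct guest_page *pg,
						const unsigned char native[3])
{
	if (!pg->writable)
		return FIXUP_PAGE_FAULT;  /* guest sees a !WRITABLE #PF */

	memcpy(pg->insn, native, 3);
	return FIXUP_PATCHED;
}
```

In this model, a guest running on Intel with VMMCALL in RX-mapped kernel text gets FIXUP_PAGE_FAULT and the instruction bytes stay unchanged, which corresponds to the breakage described above.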
This patch doesn't work because X86_FEATURE_VMCALL is a synthetic flag and is
only set by VMware paravirt code, which is why the patching doesn't happen as
would be expected.  The obvious solution would be to manually set X86_FEATURE_VMCALL
where appropriate, but given that defaulting to VMCALL has worked for years,
defaulting to VMMCALL makes me nervous, e.g. even if we splatter X86_FEATURE_VMCALL
into Intel, Centaur, and Zhaoxin, there's a possibility we'll break existing VMs
that run on hypervisors that do something weird with the vendor string.

Rather than look for X86_FEATURE_VMCALL, I think it makes sense to have this be
a "pure" inversion, i.e. patch in VMCALL if VMMCALL is not supported, as opposed
to patching in VMCALL if VMCALL is supported.
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 69299878b200..61641e69cfda 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
#endif /* CONFIG_KVM_GUEST */

#define KVM_HYPERCALL \
-        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
+        ALTERNATIVE("vmmcall", "vmcall", ALT_NOT(X86_FEATURE_VMMCALL))

/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
 * instruction.  The hypervisor may replace it with something else but only the
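
To compare the three variants discussed in this thread, here is a small userspace model, not kernel code. It assumes ALTERNATIVE(old, new, cond) means "execute old until patching has run, then execute new only if cond holds", and that an Intel guest has neither feature flag set (per the note above that X86_FEATURE_VMCALL is only set by VMware paravirt code). struct cpu, alternative(), and the scheme_*() helpers are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum insn { VMCALL, VMMCALL };

struct cpu {
	bool feat_vmmcall;  /* models X86_FEATURE_VMMCALL */
	bool feat_vmcall;   /* models the synthetic X86_FEATURE_VMCALL */
};

/* The old instruction runs until patching, and afterwards unless cond holds. */
static enum insn alternative(enum insn oldinstr, enum insn newinstr,
			     bool cond, bool patched)
{
	return (patched && cond) ? newinstr : oldinstr;
}

/* ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL) */
static enum insn scheme_original(struct cpu c, bool patched)
{
	return alternative(VMCALL, VMMCALL, c.feat_vmmcall, patched);
}

/* ALTERNATIVE("vmmcall", "vmcall", X86_FEATURE_VMCALL), as in this patch */
static enum insn scheme_v3(struct cpu c, bool patched)
{
	return alternative(VMMCALL, VMCALL, c.feat_vmcall, patched);
}

/* ALTERNATIVE("vmmcall", "vmcall", ALT_NOT(X86_FEATURE_VMMCALL)) */
static enum insn scheme_alt_not(struct cpu c, bool patched)
{
	return alternative(VMMCALL, VMCALL, !c.feat_vmmcall, patched);
}
```

Under these assumptions, the v3 scheme leaves an Intel guest on VMMCALL even after patching, while the ALT_NOT variant patches it to VMCALL. Both inverted schemes still execute VMMCALL on Intel for any hypercall made before patching, which is the #UD window objected to earlier in the thread.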
Cc: Thomas Gleixner <redacted>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <redacted>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Brijesh Singh <redacted>
Is Brijesh the author?  Co-developed-by for a one-line change would be odd...
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/include/asm/kvm_para.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 69299878b200..0267bebb0b0f 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
#endif /* CONFIG_KVM_GUEST */

#define KVM_HYPERCALL \
-        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
+    ALTERNATIVE("vmmcall", "vmcall", X86_FEATURE_VMCALL)

/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
 * instruction.  The hypervisor may replace it with something else but only the
-- 
2.17.1