Re: [PATCH v5 03/26] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: 2022-08-22 16:22:02
Also in: kvm, lkml
Sean Christopherson [off-list ref] writes:
On Mon, Aug 22, 2022, Vitaly Kuznetsov wrote:
Sean Christopherson [off-list ref] writes:
On Thu, Aug 18, 2022, Vitaly Kuznetsov wrote:
Sean Christopherson [off-list ref] writes:
On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
+ * Note: HV_X64_NESTED_EVMCS1_2022_UPDATE is not currently documented in any
+ * published TLFS version. When the bit is set, nested hypervisor can use
+ * 'updated' eVMCSv1 specification (perf_global_ctrl, s_cet, ssp, lbr_ctl,
+ * encls_exiting_bitmap, tsc_multiplier fields which were missing in 2016
+ * specification).
+ */
+#define HV_X64_NESTED_EVMCS1_2022_UPDATE BIT(0)

This bit is now defined[*], but the docs say it's only for perf_global_ctrl. Are we expecting an update to the TLFS?

  Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl fields in the enlightened VMCS.

[*] https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/feature-discovery#hypervisor-nested-virtualization-features---0x4000000a

Oh well, better this than nothing. I'll ping the people who told me about this bit that their description is incomplete.

Not that it changes anything, but I'd rather have no documentation. I'd much rather KVM say "this is the undocumented behavior" than "the documented behavior is wrong".

So I reached out to Microsoft and their answer was that for all these new eVMCS fields (including *PerfGlobalCtrl) observing architectural VMX MSRs should be enough. The *PerfGlobalCtrl case is special because of a Win11 bug (if we expose the feature in VMX feature MSRs but don't set CPUID.0x4000000A.EBX BIT(0) it just doesn't boot).

I.e. TSC_SCALING shouldn't be gated on the flag? If so, then the 2-D array approach is overkill since (a) the CPUID flag only controls PERF_GLOBAL_CTRL and (b) we aren't expecting any more flags in the future.
Unfortunately, we have to gate the presence of these new features on something, otherwise the VMM has no way to specify which particular eVMCS "revision" it wants (TL;DR: we will break migration). My initial implementation invented an 'eVMCS revision' concept:

https://lore.kernel.org/kvm/20220629150625.238286-7-vkuznets@redhat.com/

which is needed if we don't gate all these new fields on CPUID.0x4000000A.EBX BIT(0). Going forward, we will still (likely) need something when new fields show up.
What about this for an implementation?
static bool evmcs_has_perf_global_ctrl(struct kvm_vcpu *vcpu)
{
	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);

	/*
	 * Filtering VMX controls for eVMCS compatibility should only be done
	 * for guest accesses, and all such accesses should be gated on Hyper-V
	 * being enabled and initialized.
	 */
	if (WARN_ON_ONCE(!hv_vcpu))
		return false;

	return hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
}

static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu, u32 msr_index)
{
	u32 unsupported_ctrls;

	switch (msr_index) {
	case MSR_IA32_VMX_EXIT_CTLS:
	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
		if (!evmcs_has_perf_global_ctrl(vcpu))
			unsupported_ctrls |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
		return unsupported_ctrls;
	case MSR_IA32_VMX_ENTRY_CTLS:
	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
		if (!evmcs_has_perf_global_ctrl(vcpu))
			unsupported_ctrls |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
		return unsupported_ctrls;
	case MSR_IA32_VMX_PROCBASED_CTLS2:
		return EVMCS1_UNSUPPORTED_2NDEXEC;
	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
	case MSR_IA32_VMX_PINBASED_CTLS:
		return EVMCS1_UNSUPPORTED_PINCTRL;
	case MSR_IA32_VMX_VMFUNC:
		return EVMCS1_UNSUPPORTED_VMFUNC;
	default:
		KVM_BUG_ON(1, vcpu->kvm);
		return 0;
	}
}

void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
{
	u64 unsupported_ctrls = evmcs_get_unsupported_ctls(vcpu, msr_index);

	if (msr_index == MSR_IA32_VMX_VMFUNC)
		*pdata &= ~unsupported_ctrls;
	else
		*pdata &= ~(unsupported_ctrls << 32);
}

It's smaller and I like it but it would only work in conjunction with KVM_CAP_HYPERV_ENLIGHTENED_VMCS2...
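
For readers following the mask arithmetic in the proposal above, here is a minimal userspace sketch, not kernel code. It assumes the usual layout of the VMX control capability MSRs, where bits 63:32 report which controls are allowed to be 1. The `DEMO_*` constants and both function names are hypothetical stand-ins, and the bit positions are illustrative rather than KVM's real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a "load PERF_GLOBAL_CTRL" control; bit position is illustrative. */
#define DEMO_LOAD_PERF_GLOBAL_CTRL   (UINT32_C(1) << 12)
/* Stand-in for a control that eVMCS never supports (hypothetical bit). */
#define DEMO_ALWAYS_UNSUPPORTED      (UINT32_C(1) << 25)

/*
 * Build the deny mask. The perf_global_ctrl control is denied unless the
 * guest-visible CPUID bit (modelled here as a plain bool) advertises it.
 */
uint32_t demo_unsupported_ctls(bool has_perf_global_ctrl)
{
	uint32_t unsupported = DEMO_ALWAYS_UNSUPPORTED;

	if (!has_perf_global_ctrl)
		unsupported |= DEMO_LOAD_PERF_GLOBAL_CTRL;
	return unsupported;
}

/*
 * Clear the denied controls from the allowed-1 half (bits 63:32) of a
 * control MSR value. The low half is left untouched.
 */
void demo_filter_control_msr(uint64_t *pdata, uint32_t unsupported)
{
	assert(pdata != NULL);
	*pdata &= ~((uint64_t)unsupported << 32);
}
```

Calling `demo_filter_control_msr()` with `demo_unsupported_ctls(false)` hides both bits from the guest, while `demo_unsupported_ctls(true)` hides only the always-unsupported one. The VMFUNC case in the proposal differs only in that its mask is applied without the 32-bit shift.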
What I'm still concerned about is future proofing KVM for new features. When something is getting added to KVM for which no eVMCS field is currently defined, both Hyper-V-on-KVM and KVM-on-Hyper-V cases should be taken care of. It would probably be better to reverse our filtering, explicitly listing features supported in eVMCS. The lists are going to be fairly long but at least we won't have to take care of any new architectural feature added to KVM.

Having the filtering be opt-in crossed my mind as well. Reversing the filtering can be done after this series though, correct?
Yes, that's my plan: get this in to fix the immediate issue with 2022 features and probably reverse the filtering before Microsoft releases something else :-)

-- 
Vitaly
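
As an illustration of the "reversed" (opt-in) filtering discussed above, the sketch below contrasts a deny-list with an allow-list when a new control appears that neither list has been updated for. It is a userspace toy under stated assumptions: all `DEMO_*` bits and both function names are hypothetical, and the allowed-1 bits are assumed to live in bits 63:32 of the MSR value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control bits, for illustration only. */
#define DEMO_CTRL_A    (UINT32_C(1) << 0)  /* has an eVMCS field */
#define DEMO_CTRL_B    (UINT32_C(1) << 1)  /* known to lack an eVMCS field */
#define DEMO_CTRL_NEW  (UINT32_C(1) << 2)  /* added later; no list was updated */

#define DEMO_EVMCS_UNSUPPORTED  (DEMO_CTRL_B)  /* deny-list approach */
#define DEMO_EVMCS_SUPPORTED    (DEMO_CTRL_A)  /* allow-list approach */

/* Deny-list: clear only the controls known to be unsupported. */
void demo_filter_denylist(uint64_t *pdata)
{
	assert(pdata != NULL);
	*pdata &= ~((uint64_t)DEMO_EVMCS_UNSUPPORTED << 32);
}

/*
 * Allow-list: keep the low 32 bits as they are, and keep only the
 * explicitly listed controls in the allowed-1 half.
 */
void demo_filter_allowlist(uint64_t *pdata)
{
	assert(pdata != NULL);
	*pdata &= ((uint64_t)DEMO_EVMCS_SUPPORTED << 32) | UINT64_C(0xFFFFFFFF);
}
```

With a host value that allows all three controls, the deny-list variant still exposes `DEMO_CTRL_NEW` to the guest, whereas the allow-list variant hides it until someone adds it to the supported list. That is the trade-off mentioned in the thread: longer lists, but no action needed when an unrelated feature is added.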