Thread (22 messages) 22 messages, 3 authors, 2021-12-03

Re: [PATCH v2 5/8] KVM: x86: Don't bother reading sparse banks that end up being ignored

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: 2021-11-01 09:51:48
Also in: kvm, linux-arch, lkml

Sean Christopherson [off-list ref] writes:
quoted hunk ↗ jump to hunk
When handling "sparse" VP_SET requests, don't read sparse banks that
can't possibly contain a legal VP index instead of ignoring such banks
later on in sparse_set_to_vcpu_mask().  This allows KVM to cap the size
of its sparse_banks arrays for VP_SET at KVM_HV_MAX_SPARSE_VCPU_SET_BITS.

Reducing the size of sparse_banks fudges around a compilation warning
(that becomes error with KVM_WERROR=y) when CONFIG_KASAN_STACK=y, which
is selected (and can't be unselected) by CONFIG_KASAN=y when using gcc
(clang/LLVM is a stack hog in some cases so it's opt-in for clang).
KASAN_STACK adds a redzone around every stack variable, which pushes the
Hyper-V functions over the default limit of 1024.

Ideally, KVM would flat out reject such impossibilities, but the TLFS
explicitly allows providing empty banks, even if a bank can't possibly
contain a valid VP index due to its position exceeding KVM's max.

  Furthermore, for a bit 1 in ValidBankMask, it is valid state for the
  corresponding element in BanksContents can be all 0s, meaning no
  processors are specified in this bank.

Arguably KVM should reject and not ignore the "extra" banks, but that can
be done independently and without bloating sparse_banks, e.g. by reading
each "extra" 8-byte chunk individually.

Reported-by: Ajay Garg <redacted>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/hyperv.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 3d0981163eed..8832727d74d9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1753,11 +1753,16 @@ struct kvm_hv_hcall {
 static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
 				 u64 *sparse_banks, gpa_t offset)
 {
+	u16 var_cnt;
+
 	if (hc->var_cnt > 64)
 		return -EINVAL;
 
+	/* Ignore banks that cannot possibly contain a legal VP index. */
+	var_cnt = min_t(u16, hc->var_cnt, KVM_HV_MAX_SPARSE_VCPU_SET_BITS);
+
One may wonder why we're mixing up VP indices and VCPU ids (caped by
KVM_MAX_VCPUS) here as these don't have to match. The following commit
sheds some light:

commit 9170200ec0ebad70e5b9902bc93e2b1b11456a3b
Author: Vitaly Kuznetsov [off-list ref]
Date:   Wed Aug 22 12:18:28 2018 +0200

    KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
    
    Hyper-V TLFS (5.0b) states:
    
    > Virtual processors are identified by using an index (VP index). The
    > maximum number of virtual processors per partition supported by the
    > current implementation of the hypervisor can be obtained through CPUID
    > leaf 0x40000005. A virtual processor index must be less than the
    > maximum number of virtual processors per partition.
    
    Forbid userspace to set VP_INDEX above KVM_MAX_VCPUS. get_vcpu_by_vpidx()
    can now be optimized to bail early when supplied vpidx is >= KVM_MAX_VCPUS.
quoted hunk ↗ jump to hunk
 	return kvm_read_guest(kvm, hc->ingpa + offset, sparse_banks,
-			      hc->var_cnt * sizeof(*sparse_banks));
+			      var_cnt * sizeof(*sparse_banks));
 }
 
 static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool ex)
@@ -1770,7 +1775,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 	DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
 	unsigned long *vcpu_mask;
 	u64 valid_bank_mask;
-	u64 sparse_banks[64];
+	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	bool all_cpus;
 
 	if (!ex) {
@@ -1894,7 +1899,7 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 	DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
 	unsigned long *vcpu_mask;
 	unsigned long valid_bank_mask;
-	u64 sparse_banks[64];
+	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	u32 vector;
 	bool all_cpus;
Saves the day until KVM_MAX_VCPUS goes above 4096 (and when it does the
problem strikes back even worse as KVM_HV_MAX_SPARSE_VCPU_SET_BITS is
not caped at '64'). As we're good for now,

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

(I'd even suggest we add BUILD_BUG_ON(KVM_HV_MAX_SPARSE_VCPU_SET_BITS > 64))

Going forward, we can probably get rid of thes on-stack allocations
completely by either allocating these 512 bytes dynamically (lazily)
upon first usage or just adding a field to 'struct kvm_vcpu_hv' -- which
is being allcated dynamically nowadays so non-Windows guests won't suffer.

-- 
Vitaly
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help