Thread (35 messages) 35 messages, 4 authors, 2021-05-25

Re: [PATCH 07/12] KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable

From: Sean Christopherson <seanjc@google.com>
Date: 2021-05-24 23:22:04

On Thu, May 20, 2021, Jim Mattson wrote:
quoted hunk ↗ jump to hunk
Don't allow posted interrupts to modify a stale posted interrupt
descriptor (including the initial value of 0).

Empirical tests on real hardware reveal that a posted interrupt
descriptor referencing an unbacked address has PCI bus error semantics
(reads as all 1's; writes are ignored). However, kvm can't distinguish
unbacked addresses from device-backed (MMIO) addresses, so it should
really ask userspace for an MMIO completion. That's overly
complicated, so just punt with KVM_INTERNAL_ERROR.

Don't return the error until the posted interrupt descriptor is
actually accessed. We don't want to break the existing kvm-unit-tests
that assume they can launch an L2 VM with a posted interrupt
descriptor that references MMIO space in L1.

Fixes: 6beb7bd52e48 ("kvm: nVMX: Refactor nested_get_vmcs12_pages()")
Signed-off-by: Jim Mattson <redacted>
---
 arch/x86/kvm/vmx/nested.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 706c31821362..defd42201bb4 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3175,6 +3175,15 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
 				offset_in_page(vmcs12->posted_intr_desc_addr));
 			vmcs_write64(POSTED_INTR_DESC_ADDR,
 				     pfn_to_hpa(map->pfn) + offset_in_page(vmcs12->posted_intr_desc_addr));
+		} else {
+			/*
+			 * Defer the KVM_INTERNAL_ERROR exit until
+			 * someone tries to trigger posted interrupt
+			 * processing on this vCPU, to avoid breaking
+			 * existing kvm-unit-tests.
Run the lines out to 80 chars.  Also, can we change the comment to tie it to
CPU behavior in someway?  A few years down the road, "existing kvm-unit-tests"
may not have any relevant meaning, and it's not like kvm-unit-tests is bug free
either.  E.g. something like

			/*
			 * Defer the KVM_INTERNAL_ERROR exit until posted
			 * interrupt processing actually occurs on this vCPU.
			 * Until that happens, the descriptor is not accessed,
			 * and userspace can technically rely on that behavior.
			 */ 
quoted hunk ↗ jump to hunk
+			 */
+			vmx->nested.pi_desc = NULL;
+			pin_controls_clearbit(vmx, PIN_BASED_POSTED_INTR);
 		}
 	}
 	if (nested_vmx_prepare_msr_bitmap(vcpu, vmcs12))
@@ -3689,10 +3698,14 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 	void *vapic_page;
 	u16 status;
 
-	if (!vmx->nested.pi_desc || !vmx->nested.pi_pending)
+	if (!vmx->nested.pi_pending)
 		return 0;
 
+	if (!vmx->nested.pi_desc)
+		goto mmio_needed;
+
 	vmx->nested.pi_pending = false;
+
 	if (!pi_test_and_clear_on(vmx->nested.pi_desc))
 		return 0;
 
-- 
2.31.1.818.g46aad6cb9e-goog
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help