Thread: 47 messages, 5 authors, 2022-01-04

RE: [PATCH v3 22/22] kvm: x86: Disable interception for IA32_XFD on demand

From: "Tian, Kevin" <kevin.tian@intel.com>
Date: 2021-12-31 09:43:06
Also in: kvm, linux-kselftest, lkml

From: Tian, Kevin
Sent: Thursday, December 30, 2021 3:05 PM

The new change is as follows:

static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)
{
	/*
	 * Save xfd_err to guest_fpu before interrupts are enabled, so the
	 * guest value is not clobbered by host activity before the guest
	 * has a chance to consume it.
	 *
	 * Since trapping #NM starts when xfd write interception is
	 * disabled, use this flag to guard the saving operation. This
	 * implies a no-op for a non-xfd #NM due to L1 interception.
	 *
	 * Queuing the exception is done in vmx_handle_exit.
	 */
	if (vcpu->arch.xfd_no_write_intercept)
		rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
}

In the final series it will first check vcpu->arch.guest_fpu.fpstate->xfd
before the disable-interception patch is applied, and then take
the above form, similar to your suggestion on
vmx_update_exception_bitmap().

Whether to check msr_bitmap vs. an extra flag is an orthogonal open question.
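
To make the two forms of the guard concrete, here is a minimal user-space
sketch. It is not kernel code: struct vcpu_model, fake_xfd_err_msr,
read_xfd_err_msr() and the *_model function names are hypothetical
stand-ins, and the MSR read is replaced by a plain variable. The first
function models the intermediate check on fpstate->xfd, the second the
final check on xfd_no_write_intercept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware IA32_XFD_ERR MSR. */
static uint64_t fake_xfd_err_msr;

/* Hypothetical, heavily reduced model of the vCPU state involved. */
struct vcpu_model {
	uint64_t guest_xfd;		/* models guest_fpu.fpstate->xfd */
	uint64_t guest_xfd_err;		/* models guest_fpu.xfd_err */
	bool xfd_no_write_intercept;	/* models arch.xfd_no_write_intercept */
};

static uint64_t read_xfd_err_msr(void)
{
	return fake_xfd_err_msr;
}

/* Intermediate form: save only if the guest has a non-zero XFD value. */
static void nm_irqoff_check_xfd_model(struct vcpu_model *v)
{
	if (v->guest_xfd)
		v->guest_xfd_err = read_xfd_err_msr();
}

/* Final form: save only once XFD write interception has been disabled. */
static void nm_irqoff_check_flag_model(struct vcpu_model *v)
{
	if (v->xfd_no_write_intercept)
		v->guest_xfd_err = read_xfd_err_msr();
}

/* Exercises both forms with an arbitrary sample error value. */
static void guard_demo(void)
{
	struct vcpu_model v = { 0 };

	fake_xfd_err_msr = 1ULL << 18;	/* sample value, for illustration only */

	/* Neither guard is armed: both forms are a no-op. */
	nm_irqoff_check_xfd_model(&v);
	nm_irqoff_check_flag_model(&v);
	assert(v.guest_xfd_err == 0);

	/* Once the flag is set, the final form saves the value. */
	v.xfd_no_write_intercept = true;
	nm_irqoff_check_flag_model(&v);
	assert(v.guest_xfd_err == (1ULL << 18));
}
```

In both forms an #NM that has nothing to do with XFD leaves the saved
xfd_err untouched, which is the no-op behaviour the code comment above
describes.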

Then:

static int handle_exception_nmi(struct kvm_vcpu *vcpu)
{
	...
	if (is_machine_check(intr_info) || is_nmi(intr_info))
		return 1; /* handled by handle_exception_nmi_irqoff() */

	/*
	 * Queue the exception here instead of in handle_nm_fault_irqoff().
	 * This ensures the nested_vmx check is not skipped so vmexit can
	 * be reflected to L1 (when it intercepts #NM) before reaching this
	 * point.
	 */
	if (is_nm_fault(intr_info)) {
		kvm_queue_exception(vcpu, NM_VECTOR);
		return 1;
	}

	...
}
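
The ordering argument in the comment above can be illustrated with a
small user-space model. This is only a sketch under stated assumptions:
struct exit_model, enum nm_outcome, NO_VECTOR and nm_exit_model() are
hypothetical names, and the real nested check covers far more than a
single boolean. The point it shows is that L0 queues #NM only if the
exit was not first reflected to L1.

```c
#include <assert.h>
#include <stdbool.h>

#define NM_VECTOR 7	/* #NM is exception vector 7 on x86 */
#define NO_VECTOR (-1)	/* hypothetical marker: nothing queued by L0 */

enum nm_outcome { NM_REFLECTED_TO_L1, NM_QUEUED_BY_L0 };

/* Hypothetical, reduced model of the state consulted on an #NM vmexit. */
struct exit_model {
	bool guest_mode;	/* the vCPU was running L2 */
	bool l1_intercepts_nm;	/* L1 asked to intercept #NM */
	int l0_queued_vector;	/* vector queued by L0, or NO_VECTOR */
};

static enum nm_outcome nm_exit_model(struct exit_model *e)
{
	/* The nested check runs first, so L1 gets the exit if it wants it. */
	if (e->guest_mode && e->l1_intercepts_nm)
		return NM_REFLECTED_TO_L1;

	/* Only now does the per-exit handler queue #NM for the guest. */
	e->l0_queued_vector = NM_VECTOR;
	return NM_QUEUED_BY_L0;
}

static void nm_exit_demo(void)
{
	struct exit_model reflected = { true, true, NO_VECTOR };
	struct exit_model queued = { true, false, NO_VECTOR };

	assert(nm_exit_model(&reflected) == NM_REFLECTED_TO_L1);
	assert(reflected.l0_queued_vector == NO_VECTOR);

	assert(nm_exit_model(&queued) == NM_QUEUED_BY_L0);
	assert(queued.l0_queued_vector == NM_VECTOR);
}
```

Had the exception been queued in the irqoff path instead, the model's
first branch would run only after l0_queued_vector was already set,
which is the situation the comment above is meant to avoid.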

Regarding testing non-AMX nested #NM usage, it might be difficult to
trigger from a modern OS. As the comment in the Linux #NM handler notes,
#NM is expected only for XFD or for math emulation when the FPU is
missing. So we plan to run a selftest in L1 which sets CR0.TS and then
touches FPU registers. For the L1 kernel we will run two binaries, one
trapping #NM and the other not.
We have verified this scenario and didn't find any problems.

The selftest is basically as follows:

	guest_code()
	{
		cr0 = read_cr0();
		cr0 |= X86_CR0_TS;
		write_cr0(cr0);

		asm volatile("fnop");
	}

	guest_nm_handler()
	{
		cr0 = read_cr0();
		cr0 &= ~X86_CR0_TS;
		write_cr0(cr0);
	}

We run the selftest in L1 to create a nested scenario.

When L1 intercepts #NM:

	(L2) fnop
	(L0) #NM vmexit
	(L0) reflect a virtual vmexit (reason #NM) to L1
	(L1) #NM vmexit
	(L1) queue #NM exception to L2
	(L2) guest_nm_handler()
	(L2) fnop (succeed)

When L1 doesn't intercept #NM:

	(L2) fnop
	(L0) #NM vmexit
	(L0) queue #NM exception to L2
	(L2) guest_nm_handler()
	(L2) fnop (succeed)
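
The two flows above can be replayed with a small user-space model. This
is a sketch only: struct l2_model, trace_step() and the *_model
functions are hypothetical names, nothing here touches a real CR0, and
the model just records which level acts at each step. It assumes, as in
the selftest, that fnop raises #NM while CR0.TS is set and that the
handler clears CR0.TS.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define X86_CR0_TS (1UL << 3)	/* CR0.TS is bit 3 */

/* Hypothetical model of the L2 guest plus the L1 intercept setting. */
struct l2_model {
	unsigned long cr0;
	bool l1_intercepts_nm;
	char trace[256];	/* ';'-terminated list of steps taken */
};

static void trace_step(struct l2_model *m, const char *step)
{
	/* The trace buffer must have room for the step, ';' and the NUL. */
	assert(strlen(m->trace) + strlen(step) + 2 <= sizeof(m->trace));
	strcat(m->trace, step);
	strcat(m->trace, ";");
}

/* Mirrors guest_nm_handler(): clear CR0.TS so the retry can succeed. */
static void guest_nm_handler_model(struct l2_model *m)
{
	m->cr0 &= ~X86_CR0_TS;
	trace_step(m, "L2 handler");
}

/* Runs fnop, delivering #NM at most once; true if fnop completes. */
static bool run_fnop_model(struct l2_model *m)
{
	for (int attempt = 0; attempt < 2; attempt++) {
		trace_step(m, "L2 fnop");
		if (!(m->cr0 & X86_CR0_TS))
			return true;	/* no #NM: the instruction retires */

		trace_step(m, "L0 #NM vmexit");
		if (m->l1_intercepts_nm) {
			trace_step(m, "L0 reflect to L1");
			trace_step(m, "L1 #NM vmexit");
			trace_step(m, "L1 queue #NM");
		} else {
			trace_step(m, "L0 queue #NM");
		}
		guest_nm_handler_model(m);
	}
	return false;
}
```

Starting with cr0 = X86_CR0_TS, the recorded trace follows the first
list when l1_intercepts_nm is true and the second list when it is false.
In both cases the second fnop completes because the handler has cleared
CR0.TS.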

Please suggest if any more tests are necessary here.

Thanks
Kevin