Re: [PATCH v4 3/6] x86/sev-es: Split up runtime #VC handler for correct... | kvm

Re: [PATCH v4 3/6] x86/sev-es: Split up runtime #VC handler for correct state tracking

From: Peter Zijlstra <peterz@infradead.org>
Date: 2021-06-10 10:21:15
Also in: linux-coco, lkml, virtualization

Bah, I suppose the trouble is that this SEV crap requires PARAVIRT?

I should really get around to fixing noinstr validation with PARAVIRT on
:-(

On Thu, Jun 10, 2021 at 11:11:38AM +0200, Joerg Roedel wrote:

+static void vc_handle_from_kernel(struct pt_regs *regs, unsigned long error_code)

static noinstr ...

quoted hunk ↗ jump to hunk

+{
+	irqentry_state_t irq_state = irqentry_nmi_enter(regs);
 
+	instrumentation_begin();
 
+	if (!vc_raw_handle_exception(regs, error_code)) {
 		/* Show some debug info */
 		show_regs(regs);

@@ -1434,7 +1400,59 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		panic("Returned from Terminate-Request to Hypervisor\n");
 	}
 
+	instrumentation_end();
+	irqentry_nmi_exit(regs, irq_state);
+}
+
+static void vc_handle_from_user(struct pt_regs *regs, unsigned long error_code)

static noinstr ...

+{
+	irqentry_state_t irq_state = irqentry_enter(regs);
+
+	instrumentation_begin();
+
+	if (!vc_raw_handle_exception(regs, error_code)) {
+		/*
+		 * Do not kill the machine if user-space triggered the
+		 * exception. Send SIGBUS instead and let user-space deal with
+		 * it.
+		 */
+		force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0);
+	}
+
+	instrumentation_end();
+	irqentry_exit(regs, irq_state);
+}

+ linebreak

+/*
+ * Main #VC exception handler. It is called when the entry code was able to
+ * switch off the IST to a safe kernel stack.
+ *
+ * With the current implementation it is always possible to switch to a safe
+ * stack because #VC exceptions only happen at known places, like intercepted
+ * instructions or accesses to MMIO areas/IO ports. They can also happen with
+ * code instrumentation when the hypervisor intercepts #DB, but the critical
+ * paths are forbidden to be instrumented, so #DB exceptions currently also
+ * only happen in safe places.
+ */
+DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
+{
+	/*
+	 * Handle #DB before calling into !noinstr code to avoid recursive #DB.
+	 */
+	if (error_code == SVM_EXIT_EXCP_BASE + X86_TRAP_DB) {
+		vc_handle_trap_db(regs);
+		return;
+	}
+
+	/*
+	 * This is invoked through an interrupt gate, so IRQs are disabled. The
+	 * code below might walk page-tables for user or kernel addresses, so
+	 * keep the IRQs disabled to protect us against concurrent TLB flushes.
+	 */
+
+	if (user_mode(regs))
+		vc_handle_from_user(regs, error_code);
+	else
+		vc_handle_from_kernel(regs, error_code);
 }

#DB and MCE use idtentry_mce_db and split out in asm. When I look at
idtentry_vc, it appears to me that VC_SAFE_STACK already implies
from-user, or am I reading that wrong?

Ah, it appears you're muddling things up again by then also calling
safe_stack_ from exc_.

How about you don't do that and have exc_ call your new from_kernel
function, then we know that safe_stack_ is always from-user. Then also
maybe do:

	s/VS_SAFE_STACK/VC_USER/
	s/safe_stack_/noist_/

to match all the others (#DB/MCE).

Also, AFAICT, you don't actually need DEFINE_IDTENTRY_VC_IST, it doesn't
have an ASM counterpart.

So then you end up with something like:

DEFINE_IDTENTRY_VC(exc_vc)
{
	if (unlikely(on_vc_fallback_stack(regs))) {
		instrumentation_begin();
		panic("boohooo\n");
		instrumentation_end();
	}

	vc_from_kernel();
}

DEFINE_IDTENTRY_VC_USER(exc_vc)
{
	vc_from_user();
}

Which is, I'm thinking, much simpler, no?

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help