Re: [PATCH] x86/tdx: Generate SIGBUS on userspace MMIO
From: Kirill A. Shutemov <hidden>
Date: 2024-06-10 14:31:18
Also in:
lkml
On Mon, Jun 10, 2024 at 06:55:56AM -0700, Dave Hansen wrote:
On 5/28/24 03:09, Kirill A. Shutemov wrote:quoted
Currently, attempting to perform MMIO from userspace in a TDX guest leads to a warning about an unexpected #VE and SIGSEGV being delivered to the process.Does it _always_ result in a #VE? Or is this only when guests mmap() something like from a driver and the host doesn't back the shared memory?
See below.
quoted
Enlightened userspace may choose to handle MMIO on their own if the kernel does not emulate it. Handle the EPT_VIOLATION exit reason for userspace and deliver SIGBUS instead of SIGSEGV. SIGBUS is more appropriate for the MMIO situation.Is any userspace _actually_ doing this? Sure, SIGBUS is more appropriate but in practice unprepared userspace crashes either way.
Microsoft folks have plans to do this. I don't know if any current code handles SIGBUS this way.
quoted
@@ -641,17 +647,20 @@ static int virt_exception_user(struct pt_regs *regs, struct ve_info *ve) switch (ve->exit_reason) { case EXIT_REASON_CPUID: return handle_cpuid(regs, ve); + case EXIT_REASON_EPT_VIOLATION: + if (is_private_gpa(ve->gpa)) + panic("Unexpected EPT-violation on private memory."); + + force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)ve->gla); + + /* Return 0 to avoid incrementing RIP */ + return 0;This _really_ needs a comment, probably even a helper function where you can actually explain what is going on. I could barely remember what this is for today. There's no hope for me in a couple of years. Just thinking through the possibilities here: Private=> Private : no #VE Private=> Anything else : fatal shutdown Shared => Shared : no #VE Shared => Private : #VE (end up here) Shared => !Present : #VE (end up here)
Are you talking about page state vs. page mapping here? It is wrong frame to think about it. We get here as result of EPT violation. Either shared or secret EPT. So we don't have an present EPT entry for the page or allowed permissions doesn't match the access. The is_private_gpa() check catches cases when private EPT doesn't have a valid entry for page accessed via private mapping: page is not accepted or removed by VMM. This case is only reachable for debug-TD. In for non-debug TD it leads to unrecoverable TD exit. The same story as for kernel addresses. Normal shared memory doesn't cause #VE even if the memory was not converted to shared explicitly. On the first access to a page via shared mapping VMM will allocate a new page and fill EPT[*]. Except for GPA ranges dedicated for MMIO. For these VMM will not fill EPT and it causes #VE which interpreted as MMIO access. Note that only emulated devices require such mechanism to get MMIO working. For device passthrough, device MMIO range mapped directly into the guest and handled transparently for the guest kernel. Does it make sense? [*] I don't like this implicit conversion to shared. I would prefer #VE here too, but it is what we have. -- Kiryl Shutsemau / Kirill A. Shutemov