Re: Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)
From: Andy Lutomirski <luto@kernel.org>
Date: 2020-06-23 18:27:09
Also in: kvm, lkml
On Tue, Jun 23, 2020 at 8:23 AM Andrew Cooper [off-list ref] wrote:
> On 23/06/2020 14:03, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2020 at 02:12:37PM +0200, Joerg Roedel wrote:
> > > On Tue, Jun 23, 2020 at 01:50:14PM +0200, Peter Zijlstra wrote:
> > > > If SNP is the sole reason #VC needs to be IST, then I'd strongly urge you to only make it IST if/when you try and make SNP happen, not before.
> > >
> > > It is not the only reason, when ES guests gain debug register support then #VC also needs to be IST, because #DB can be promoted into #VC then, and as #DB is IST for a reason, #VC needs to be too.
> >
> > Didn't I read somewhere that that is only so for Rome/Naples but not for the later chips (Milan) which have #DB pass-through?
>
> I don't know about hardware timelines, but some future part can now opt in to having debug registers as part of the encrypted state, and swapped by VMExit, which would make debug facilities generally usable, and supposedly safe from the #DB infinite loop issues, at which point the hypervisor need not intercept #DB for safety reasons.
>
> It's worth noting that on current parts, the hypervisor can set up debug facilities on behalf of the guest (or behind its back) as the DR state is unencrypted, but that attempting to intercept #DB will redirect to #VC inside the guest and cause fun.
>
> (Also spare a thought for 32bit kernels which have to cope with userspace singlestepping the SYSENTER path with every #DB turning into #VC.)
What do you mean 32-bit? 64-bit kernels have exactly the same problem. At least the stack is okay, though.

Anyway, since I'm way behind on this thread, here are some thoughts:

First, I plan to implement actual precise recursion detection for the IST stacks. We'll be able to reliably panic when disallowed recursion happens.

Second, I don't object *that* strongly to switching to a second #VC stack if an NMI or MCE happens, but we really need to make sure we cover *all* the bases. And #VC is distressingly close to "happens at all kinds of unfortunate times and the guest doesn't actually have much ability to predict it" right now. So we have #VC + #DB + #VC, #VC + NMI + #VC, #VC + MCE + #VC, and even worse options. So doing the stack shift reliably is not necessarily possible in a clean way.

Let me contemplate. And maybe produce some code soon.