Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: 2024-01-30 14:06:33
Also in: lkml
On Tue, Jan 30, 2024 at 2:10 PM Reshetova, Elena [off-list ref] wrote:
The internals of Intel DRBG behind RDRAND/RDSEED have been publicly documented, so the structure is no secret. Please see [1] for overall structure and other aspects. So, yes, your overall understanding is correct (there are many more details though).
Indeed, have read it.
So maybe this patch #1 (of 2) can be dropped?
Before we start debating this patchset, what is your opinion on the original problem we raised for CoCo VMs when both RDRAND/RDSEED are made to fail deliberately?
My general feeling is that this seems like a hardware problem.

If you have a VM, the hypervisor should provide a seed. But with CoCo, you can't trust the host to do that. But can't the host do anything to the VM that it wants, like fiddle with its memory? No, there are special new hardware features to encrypt and protect ram to prevent this. So if you've found yourself in a situation where you absolutely cannot trust the host, AND the hardware already has working guest protections from the host, then it would seem you also need a hardware solution to handle seeding. And you're claiming that RDRAND/RDSEED is the *only* hardware solution available for it.

Is that an accurate summary? If it is, then the actual problem is that the hardware provided to solve this problem doesn't actually solve it that well, so we're caught deciding between guest-guest DoS (some other guest on the system uses all RDRAND resources) and cryptographic failure because of a malicious host creating a deterministic environment.

But I have two questions:

1) Is this CoCo VM stuff even real? Is protecting guests from hosts actually possible in the end? Is anybody doing this? I assume they are, so maybe ignore this question, but I would like to register my gut feeling that on the Intel platform this seems like an endless whack-a-mole problem like SGX.

2) Can a malicious host *actually* create a fully deterministic environment? One that'll produce the same timing for the jitter entropy creation, and all the other timers and interrupts and things? I imagine the attestation part of CoCo means these VMs need to run on real Intel silicon and so it can't be single stepped in TCG or something, right? So is this problem actually a real one? And to what degree? Any good experimental research on this?

Either way, if you're convinced RDRAND is the *only* way here, adding a `WARN_ON(is_in_early_boot)` to the RDRAND (but not RDSEED) failure path seems a fairly lightweight bandaid.
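To make that suggestion concrete, here is a rough userspace sketch of a bounded RDRAND retry that only complains when it fails during early boot. It is not kernel code: `rdrand_once_fn`, `rdrand_retry`, `warn_count`, the two fake sources, and the retry bound of 10 are all made up for illustration, and the `fprintf` merely stands in for what `WARN_ON()` would do in the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative retry bound; an assumption, not taken from this thread. */
#define RETRY_LOOPS 10

/* Hypothetical stand-in for a single RDRAND attempt:
 * returns true and fills *out on success, false on failure. */
typedef bool (*rdrand_once_fn)(unsigned long *out);

/* Counts how often the early-boot warning fired (stand-in for WARN_ON). */
static unsigned int warn_count;

/* Try the source up to RETRY_LOOPS times; warn only if every attempt
 * failed AND we are still in early boot, when no other seed exists. */
static bool rdrand_retry(rdrand_once_fn once, bool in_early_boot,
                         unsigned long *out)
{
    for (int i = 0; i < RETRY_LOOPS; i++) {
        if (once(out))
            return true;
    }
    if (in_early_boot) {
        warn_count++;
        fprintf(stderr, "warning: RDRAND failed %d times in early boot\n",
                RETRY_LOOPS);
    }
    return false;
}

/* Fake source that never succeeds (e.g. a starved or hostile environment). */
static bool always_fail(unsigned long *out)
{
    (void)out;
    return false;
}

/* Fake source that fails twice, then succeeds on the third attempt. */
static unsigned int flaky_calls;
static bool succeed_on_third(unsigned long *out)
{
    if (++flaky_calls < 3)
        return false;
    *out = 0x1234UL;
    return true;
}
```

With `succeed_on_third` the retry loop hides the transient failures and nothing is logged; with `always_fail` the caller gets `false` either way, and the warning fires only when `in_early_boot` is set.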
I just wonder if the hardware people could come up with something more reliable that we wouldn't have to agonize over in the kernel.

Jason