Thread: 99 messages, 15 authors, 2024-02-15

Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: Daniel P. Berrangé <hidden>
Date: 2024-01-30 14:43:29
Also in: lkml

On Tue, Jan 30, 2024 at 03:06:14PM +0100, Jason A. Donenfeld wrote:
> Is that an accurate summary? If it is, then the actual problem is that
> the hardware provided to solve this problem doesn't actually solve it
> that well, so we're caught deciding between guest-guest DoS (some
> other guest on the system uses all RDRAND resources) and cryptographic
> failure because of a malicious host creating a deterministic
> environment.

In a CoCo VM environment, a guest DoS is not a unique threat
scenario, as it is unrelated to confidentiality. Ensuring
fair subdivision of resources between competing guests is
just a general VM threat. There are many easy ways a host
admin can stop a guest making computational progress. Simply
not scheduling the guest vCPU threads is one. CoCo doesn't
try to solve this problem.

Preserving confidentiality is the primary aim of CoCo.

IOW, if the guest boot is stalled because the kernel is spinning
waiting on RDRAND to return data, that's fine. If the kernel
panics after "n" RDRAND failures in a row, that's fine too. They
are both just yet another DoS scenario.

If the kernel ignores the RDRAND failure and lets it boot with
degraded RNG state that is susceptible to attacks, that would
not be OK for CoCo.

> But I have two questions:
>
> 1) Is this CoCo VM stuff even real? Is protecting guests from hosts
> actually possible in the end? Is anybody doing this? I assume they
> are, so maybe ignore this question, but I would like to register my
> gut feeling that on the Intel platform this seems like an endless
> whack-a-mole problem like SGX.

It is real, but it is also not perfect. I expect it /will/ be an
endless whack-a-mole problem though.

Nonetheless, it is a significant layer of defence, as compared
to traditional VMs where the guest RAM is nothing more than a
'cat' command away from host admin exposure.

> 2) Can a malicious host *actually* create a fully deterministic
> environment? One that'll produce the same timing for the jitter
> entropy creation, and all the other timers and interrupts and things?
> I imagine the attestation part of CoCo means these VMs need to run on
> real Intel silicon and so it can't be single stepped in TCG or
> something, right? So is this problem actually a real one? And to what
> degree? Any good experimental research on this?
>
> Either way, if you're convinced RDRAND is the *only* way here, adding
> a `WARN_ON(is_in_early_boot)` to the RDRAND (but not RDSEED) failure
> path seems a fairly lightweight bandaid. I just wonder if the hardware
> people could come up with something more reliable that we wouldn't
> have to agonize over in the kernel.

If RDRAND failure is more of a theoretical problem than a practical
real world problem, I'd be inclined to just let the kernel loop on
RDRAND failure until it succeeds, with a WARN after 'n' iterations to
aid diagnosis of the stall in the unlikely event that it is hit.
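
As a rough illustration of the loop-and-warn behaviour described above,
here is a minimal userspace C sketch. It is not kernel code and not a
proposed patch: get_random_or_stall(), flaky_source() and
RETRY_WARN_THRESHOLD are hypothetical names, the threshold of 10 is an
arbitrary stand-in for 'n', fprintf() stands in for the kernel's WARN
machinery, and the source is passed as a function pointer so that a
fake can replace RDRAND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for 'n'; the value 10 is an assumption. */
#define RETRY_WARN_THRESHOLD 10

/* A source returns true and fills *out on success, false on failure. */
typedef bool (*rng_source_fn)(unsigned long *out);

/* Fake source standing in for RDRAND: fails a set number of times first. */
static unsigned int flaky_failures_left;

static bool flaky_source(unsigned long *out)
{
	if (flaky_failures_left > 0) {
		flaky_failures_left--;
		return false;
	}
	*out = 0x1234UL;	/* fixed placeholder, not real randomness */
	return true;
}

/*
 * Loop until the source succeeds; never hand back degraded output.
 * Warn once when the threshold is reached so a stall can be diagnosed.
 * Returns the number of failed attempts seen before success.
 */
static unsigned long get_random_or_stall(rng_source_fn source,
					 unsigned long *out, bool *warned)
{
	unsigned long failures = 0;

	assert(source != NULL && out != NULL && warned != NULL);
	*warned = false;

	while (!source(out)) {
		failures++;
		if (failures == RETRY_WARN_THRESHOLD) {
			fprintf(stderr,
				"rng: %lu consecutive failures, still retrying\n",
				failures);
			*warned = true;
		}
	}
	return failures;
}
```

The point of the sketch is only the policy: a persistent failure turns
into a visible stall (a DoS), never into a boot with weak RNG state.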

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|