Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM
From: Sean Christopherson <seanjc@google.com>
Date: 2023-02-22 21:53:24
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
On Thu, Feb 16, 2023, David Hildenbrand wrote:
> On 16.02.23 06:13, Mike Rapoport wrote:
> > Hi,
> >
> > On Fri, Dec 02, 2022 at 02:13:38PM +0800, Chao Peng wrote:
> > > This patch series implements KVM guest private memory for confidential computing scenarios like Intel TDX[1]. If a TDX host accesses TDX-protected guest memory, a machine check can happen, which can further crash the running host system; this is terrible for multi-tenant configurations. The host accesses include those from KVM userspace like QEMU. This series addresses KVM-userspace-induced crashes by introducing new mm and KVM interfaces so KVM userspace can still manage guest memory via an fd-based approach, but it can never access the guest memory content.
> >
> > Sorry for jumping late.
> >
> > Unless I'm missing something, hibernation will also cause a machine check when there is TDX-protected memory in the system. When hibernation creates a memory snapshot it essentially walks all physical pages and saves their contents, so for TDX memory this will trigger a machine check, right?
For hibernation specifically, I think that should be handled elsewhere as hibernation is simply incompatible with TDX, SNP, pKVM, etc. without paravirtualizing the guest, as none of those technologies support auto-export a la s390. I suspect the right approach is to disallow hibernation if KVM is running any protected guests.
> I recall bringing that up in the past (also memory access due to kdump, /proc/kcore) and was told that the main focus for now is preventing unprivileged users from crashing the system, that is, not mapping such memory into user space (e.g., QEMU). In the long run, we'll also want to handle such pages properly in the other events where the kernel might access them.
Ya, unless someone strongly objects, the plan is to essentially treat "attacks" from privileged users as out of scope for initial support, and then iterate as needed to fix/enable more features. FWIW, read accesses, e.g. kdump, should be ok for TDX and SNP as they both play nice with "bad" reads. pKVM is a different beast though as I believe any access to guest private memory will fault. But my understanding is that this series would be a big step forward for pKVM, which currently doesn't have any safeguards.