Thread: 83 messages, 11 authors, 2026-02-05

Re: [RFC PATCH v1 05/37] KVM: guest_memfd: Wire up kvm_get_memory_attributes() to per-gmem attributes

From: Sean Christopherson <seanjc@google.com>
Date: 2026-01-29 01:03:29
Also in: cgroups, kvm, linux-doc, linux-fsdevel, linux-kselftest, linux-mm, lkml

On Wed, Jan 28, 2026, Jason Gunthorpe wrote:
> On Wed, Jan 28, 2026 at 01:47:50PM -0800, Ackerley Tng wrote:
> > Alexey Kardashevskiy [off-list ref] writes:
> > > [...snip...]
> > Thanks for bringing this up!
> > > I am trying to make it work with TEE-IO where fd of VFIO MMIO is a dmabuf
> > > fd while the rest (guest RAM) is gmemfd. The above suggests that if there
> > > is gmemfd - then the memory attributes are handled by gmemfd which is...
> > > expected?
> > I think this is not expected.
> >
> > IIUC MMIO guest physical addresses don't have an associated memslot, but
> > if you managed to get to that line in kvm_gmem_get_memory_attributes(),
> > then there is an associated memslot (slot != NULL)?
> I think they should have a memslot, shouldn't they? I imagine creating
> a memslot from a FD and the FD can be memfd, guestmemfd, dmabuf, etc,
> etc ?

Yeah, there are two flavors of MMIO for KVM guests.  Emulated MMIO, which is
what Ackerley is thinking of, and "host" MMIO (for lack of a better term), which
is what I assume "fd of VFIO MMIO" is referring to.

Emulated MMIO does NOT have memslots[*].  There are some wrinkles and technical
exceptions, e.g. read-only memslots for emulating option ROMs, but by and large,
lack of a memslot means Emulated MMIO.
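
As a rough illustration of the rule above, here is a toy, userspace-only C
sketch (not KVM code).  Every name in it is hypothetical, including the
host_mmio flag: real KVM identifies host MMIO from the pfn (see the
kvm_is_mmio_pfn() calls), not from a flag on the memslot, and the read-only
memslot exceptions are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Toy stand-in for a memslot: a contiguous range of guest frame numbers. */
struct toy_memslot {
	gfn_t base_gfn;
	uint64_t npages;
	bool host_mmio;	/* hypothetical flag, for illustration only */
};

enum toy_access_kind { TOY_EMULATED_MMIO, TOY_HOST_MMIO, TOY_RAM };

/* Linear search for the slot that contains @gfn; NULL if there is none. */
static const struct toy_memslot *
toy_gfn_to_memslot(const struct toy_memslot *slots, size_t nr, gfn_t gfn)
{
	assert(slots || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	}
	return NULL;
}

/* No memslot => emulated MMIO; otherwise it is "just another memslot". */
static enum toy_access_kind
toy_classify(const struct toy_memslot *slots, size_t nr, gfn_t gfn)
{
	const struct toy_memslot *slot = toy_gfn_to_memslot(slots, nr, gfn);

	if (!slot)
		return TOY_EMULATED_MMIO;
	return slot->host_mmio ? TOY_HOST_MMIO : TOY_RAM;
}
```

In this toy model, a gfn that falls in a hole between slots is classified as
emulated MMIO, while a gfn inside a slot is RAM or host MMIO depending on what
backs the slot.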

Host MMIO isn't something KVM really cares about, in the sense that, for the most
part, it's "just another memslot".  KVM x86 does need to identify host MMIO for
vendor-specific reasons, e.g. to ensure UC memory stays UC when using EPT (MTRRs
are ignored), to create shared mappings when SME is enabled, and to mitigate the
lovely MMIO Stale Data vulnerability.

But those Host MMIO edge cases are almost entirely contained to make_spte() (see
the kvm_is_mmio_pfn() calls).  And so the vast, vast majority of "MMIO" code in
KVM is dealing with Emulated MMIO, and when most people talk about MMIO in KVM,
they're also talking about Emulated MMIO.

> > Either way, guest_memfd shouldn't store attributes for guest physical
> > addresses that don't belong to some guest_memfd memslot.
> >
> > I think we need a broader discussion for this on where to store memory
> > attributes for MMIO addresses.
> >
> > I think we should at least have line of sight to storing memory
> > attributes for MMIO addresses, in case we want to design something else,
> > since we're putting vm_memory_attributes on a deprecation path with this
> > series.
> I don't know where you want to store them in KVM long term, but they
> need to come from the dmabuf itself (probably via a struct
> p2pdma_provider) and currently it is OK to assume all DMABUFs are
> uncachable MMIO that is safe for the VM to convert into "write
> combining" (eg Normal-NC on ARM)

+1.  For guest_memfd, we initially defined per-VM memory attributes to track
private vs. shared.  But as Ackerley noted, we are in the process of deprecating
that support, e.g. by making it incompatible with various guest_memfd features,
in favor of having each guest_memfd instance track the state of a given page.

The original guest_memfd design was that it would _only_ hold private pages, and
so tracking private vs. shared in guest_memfd didn't make any sense.  As we've
pivoted to in-place conversion, tracking private vs. shared in the guest_memfd
has basically become mandatory.  We could maaaaaybe make it work with per-VM
attributes, but it would be insanely complex.

For a dmabuf fd, the story is the same as guest_memfd.  Unless private vs. shared
is all or nothing, and can never change, then the only entity that can track that
info is the owner of the dmabuf.  And even if the private vs. shared attributes
are constant, tracking it external to KVM makes sense, because then the provider
can simply hardcode %true/%false.

As for _how_ to do that, no matter where the attributes are stored, we're going
to have to teach KVM to play nice with a non-guest_memfd provider of private
memory.
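
To make the "owner of the fd tracks private vs. shared" idea concrete, here is
a toy, userspace-only C sketch.  It is purely an assumption for illustration,
not a proposed KVM or dma-buf interface: the toy_provider struct, its ops
table, and the is_private() callback are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_provider;

/* Hypothetical per-provider hook: is the page at @index private? */
struct toy_provider_ops {
	bool (*is_private)(const struct toy_provider *p, uint64_t index);
};

struct toy_provider {
	const struct toy_provider_ops *ops;
	const bool *private_map;	/* per-page state, if the provider tracks it */
	uint64_t npages;
};

/* guest_memfd-like provider: per-page state that can change over time. */
static bool toy_tracked_is_private(const struct toy_provider *p, uint64_t index)
{
	assert(index < p->npages);
	return p->private_map[index];
}

/* Provider whose attribute never changes: simply hardcode the answer. */
static bool toy_always_shared(const struct toy_provider *p, uint64_t index)
{
	(void)p;
	(void)index;
	return false;
}

static const struct toy_provider_ops toy_tracked_ops = {
	.is_private = toy_tracked_is_private,
};

static const struct toy_provider_ops toy_fixed_shared_ops = {
	.is_private = toy_always_shared,
};

/* The consumer asks the owner of the memory instead of a per-VM table. */
static bool toy_slot_is_private(const struct toy_provider *p, uint64_t index)
{
	return p->ops->is_private(p, index);
}
```

The consumer only ever calls toy_slot_is_private(), so a provider with
constant attributes and a provider with convertible pages look the same from
its side.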