Thread (97 messages) 97 messages, 6 authors, 2025-11-19

Re: [PATCH v3 24/25] KVM: TDX: Guard VM state transitions with "all" the locks

From: Sean Christopherson <seanjc@google.com>
Date: 2025-10-27 18:10:52
Also in: kvm, kvm-riscv, kvmarm, linux-coco, linux-mips, linux-riscv, linuxppc-dev, lkml, loongarch

On Mon, Oct 27, 2025, Rick P Edgecombe wrote:
On Mon, 2025-10-27 at 17:26 +0800, Yan Zhao wrote:
quoted
quoted
Ugh, I'd rather not?  Refresh me, what's the story with "v1"?  Are we now on
v2?
No... We are now on v1.
As in [1], I found that TDX module changed SEAMCALL TDH_VP_INIT behavior to
require exclusive lock on resource TDR when leaf_opcode.version > 0.

Therefore, we moved KVM_TDX_INIT_VCPU to tdx_vcpu_unlocked_ioctl() in patch
22.

[1] https://lore.kernel.org/all/aLa34QCJCXGLk%2Ffl@yzhao56-desk.sh.intel.com/ (local)
Looking at the PDF docs, TDR exclusive locking in version == 1 is called out at
least back to the oldest ABI docs I have (March 2024). Not sure about the
assertion that the behavior changed, but if indeed this was documented, it's a
little bit our bad. We might consider being flexible around calling it a TDX ABI
break?

Sean, can you elaborate why taking mmu_lock is objectionable here, though?
It's not, I was just hoping we could avoid yet more complexity.

Assuming we do indeed need to take mmu_lock, can you send a patch that applies
on top?  I'm not planning on sending any of this to stable@, so I don't see any
reason to try and juggle patches around.
Note, myself (and I think Yan?) determined the locking by examining TDX module
source. For myself, it's possible I misread the locking originally.

Also, I'm not sure about switching gears at this point, but it makes me wonder
about the previously discussed option of trying to just duplicate the TDX locks
on the kernel side.
Please no.  At best that will yield a pile of effectively useless code.  At worst,
it will make us lazy and lead to real bugs because we don't propery guard the *KVM*
flows that need exclusivity relative to what is going on in the TDX-Module.
Or perhaps make some kind of debug time lockdep type thing to document/check
the assumptions in the kernel. Something along the lines of this patch, but
to map the TDX locks to KVM locks or something. As we add more things (DPAMT,
etc), it doesn't seem like the TDX locking is getting tamer...
Hmm, I like the idea, but actually getting meaningful coverage could be quite
difficult.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help