Thread: 7 messages, 4 authors, 2023-11-01

Re: [PATCH v2] KVM x86/xen: add an override for PVCLOCK_TSC_STABLE_BIT

From: David Woodhouse <dwmw2@infradead.org>
Date: 2023-10-31 23:07:29
Also in: kvm, lkml

On Tue, 2023-10-31 at 22:58 +0000, Sean Christopherson wrote:
On Tue, Oct 31, 2023, David Woodhouse wrote:
On Tue, 2023-10-31 at 15:39 -0700, Sean Christopherson wrote:
On Tue, Oct 31, 2023, Paul Durrant wrote:
Any reason not to make this a generic "capability" instead of a Xen specific flag?
E.g. I assume these problematic guests would mishandle PVCLOCK_TSC_STABLE_BIT if
it showed up in kvmclock, but they don't use kvmclock so it's not a problem in
practice.
No, those guests are just fine with kvmclock. It's the *Xen* page they
forgot to map to userspace for the vDSO to use. And it's Xen (true Xen)
which made you jump through hoops to use the TSC that way, such that it
would actually expose the PVCLOCK_TSC_STABLE_BIT. We don't expect, and
have never seen, such issues with native KVM guests.
Hmm, and I suppose theoretically the guest kernel could choose to ignore the Xen
interface for whatever reason.  Mostly out of curiosity, is this flag something
that'd be set anytime Xen is advertised to the guest?
Probably not in QEMU; I'll make it optional there.

Hosting providers who are migrating millions of Xen guests to KVM and
want to do so with as little customer pain as possible, and who have
already had customer failures due to this guest kernel bug... are more
likely to turn it on for all "Xen" guests.
I doubt there's a real need or use case, but it'd require less churn and IMO is
simpler overall, e.g.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e3eb608b6692..731b201bfd5a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3225,7 +3225,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 
        /* If the host uses TSC clocksource, then it is stable */
        pvclock_flags = 0;
-       if (use_master_clock)
+       if (use_master_clock && !v->kvm->arch.force_tsc_unstable)
                pvclock_flags |= PVCLOCK_TSC_STABLE_BIT;
 
        vcpu->hv_clock.flags = pvclock_flags;
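
For readers who want to poke at the logic outside the kernel, here is a minimal userspace sketch of the flag computation in the diff above. It is only a model: `struct clock_model` and `model_pvclock_flags()` are hypothetical names invented for illustration, and `force_tsc_unstable` is the proposed field from the diff, not an existing KVM field. Only the `PVCLOCK_TSC_STABLE_BIT` value matches the pvclock ABI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as in the pvclock ABI header. */
#define PVCLOCK_TSC_STABLE_BIT (1 << 0)

/* Hypothetical stand-in for the per-VM state the diff consults. */
struct clock_model {
	bool use_master_clock;   /* host is using the TSC clocksource */
	bool force_tsc_unstable; /* proposed override, assumed per-VM */
};

/*
 * Model of the flag computation: advertise a stable TSC only when the
 * master clock is in use and the override has not been set.
 */
static uint8_t model_pvclock_flags(const struct clock_model *vm)
{
	uint8_t pvclock_flags = 0;

	if (vm->use_master_clock && !vm->force_tsc_unstable)
		pvclock_flags |= PVCLOCK_TSC_STABLE_BIT;

	return pvclock_flags;
}
```

With `use_master_clock` true and the override clear, the model returns `PVCLOCK_TSC_STABLE_BIT`; setting the override makes it return 0 regardless of the clocksource.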

I also assume this is a "set and forget" thing?  I.e. KVM can require the flag
to be set before any vCPUs are created.
Hrm, not sure we have previously required that the KVM_XEN_HVM_CONFIG
setup be done before any vCPUs were created.
Oh, I was asking in the context of adding a generic capability.
Yeah, it's saner for it to be set-and-forget. We *could* contrive some
kind of detection for the affected guest kernels and turn it off just
for them... but no, I just don't want to.
I tend to prefer *not* to push ordering requirements onto userspace.
For per-VM flags that are consumed by vCPUs, it makes reasoning about correctness
and what is/isn't allowed much, much easier.
Does it need to be a per-vcpu thing? 
Huh?  No, I was only asking (again, for a generic capability) if we could do

                mutex_lock(&kvm->lock);
                if (!kvm->created_vcpus) {
                        kvm->arch.force_tsc_unstable = true;
                        r = 0;
                }
                mutex_unlock(&kvm->lock);

So that it would be blatantly obvious that there's no race with checking a per-VM
flag without any lock/RCU protections.
Makes sense. Although TBH if the VMM wants to flip this bit on and off
at runtime while the guest clocks are being updated, it deserves what
it gets. It's not a problem for KVM.
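
As an illustration of the "set before any vCPUs are created" ordering discussed above, the following is a rough userspace model using a pthread mutex in place of `kvm->lock`. Everything here is an assumption made for the sketch: `struct vm_model`, `model_set_force_tsc_unstable()` and `model_create_vcpu()` are hypothetical names, and returning `-EINVAL` for a late request is a guess, since the snippet in the thread does not show the initial value of `r`.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-VM state involved. */
struct vm_model {
	pthread_mutex_t lock;    /* stands in for kvm->lock */
	int created_vcpus;       /* stands in for kvm->created_vcpus */
	bool force_tsc_unstable; /* proposed set-and-forget flag */
};

/*
 * Accept the flag only while no vCPU exists. Once a vCPU has been
 * created the flag can no longer change, so vCPU paths could read it
 * without taking the lock.
 */
static int model_set_force_tsc_unstable(struct vm_model *vm)
{
	int r = -EINVAL; /* assumed error code for a late request */

	pthread_mutex_lock(&vm->lock);
	if (!vm->created_vcpus) {
		vm->force_tsc_unstable = true;
		r = 0;
	}
	pthread_mutex_unlock(&vm->lock);

	return r;
}

/* vCPU creation takes the same lock, so it cannot race with the setter. */
static void model_create_vcpu(struct vm_model *vm)
{
	pthread_mutex_lock(&vm->lock);
	vm->created_vcpus++;
	pthread_mutex_unlock(&vm->lock);
}
```

In this model, calling the setter on a fresh VM succeeds, while calling it after `model_create_vcpu()` fails and leaves the flag untouched. That is the property that makes an unlocked read from a vCPU path easy to reason about.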

Attachments

  • smime.p7s [application/pkcs7-signature] 5965 bytes