Re: [PATCH 09/13] KVM: arm64: Add clock for hyp tracefs
From: Vincent Donnefort <hidden>
Date: 2024-09-16 12:39:17
Also in:
kvmarm, lkml
On Fri, Sep 13, 2024 at 04:21:05PM -0700, 'John Stultz' via kernel-team wrote:
On Wed, Sep 11, 2024 at 2:31 AM Vincent Donnefort [off-list ref] wrote:
Configure the hypervisor tracing clock before starting tracing. For tracing
purpose, the boot clock is interesting as it doesn't stop on suspend.
However, it is corrected on a regular basis, which implies we need to
re-evaluate it every once in a while.

Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <redacted>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Christopher S. Hall <redacted>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Lakshmi Sowjanya D <redacted>
Signed-off-by: Vincent Donnefort <redacted>
...
+static void __hyp_clock_work(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct hyp_trace_buffer *hyp_buffer;
+	struct hyp_trace_clock *hyp_clock;
+	struct system_time_snapshot snap;
+	u64 rate, delta_cycles;
+	u64 boot, delta_boot;
+	u64 err = 0;
+
+	hyp_clock = container_of(dwork, struct hyp_trace_clock, work);
+	hyp_buffer = container_of(hyp_clock, struct hyp_trace_buffer, clock);
+
+	ktime_get_snapshot(&snap);
+	boot = ktime_to_ns(snap.boot);
+
+	delta_boot = boot - hyp_clock->boot;
+	delta_cycles = snap.cycles - hyp_clock->cycles;
+
+	/* Compare hyp clock with the kernel boot clock */
+	if (hyp_clock->mult) {
+		u64 cur = delta_cycles;
+
+		cur *= hyp_clock->mult;

Mult overflow protection (I see you already have a max_delta value) is probably needed here.
That should never really happen with the max_delta. But I could add a WARN_ON and fall back to a 128-bit computation here too?
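
For reference, a minimal user-space sketch of what such a 128-bit fallback could look like. This is not code from the patch: cycles_to_ns() is a hypothetical helper, the overflow bound is computed inline rather than taken from hyp_clock->max_delta, and unsigned __int128 is a GCC/Clang extension. In the kernel, mul_u64_u32_shr() from <linux/math64.h> performs a comparable widened multiply and shift.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): convert a cycle delta to
 * nanoseconds as (cycles * mult) >> shift, assuming shift < 64.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	/* Largest cycle delta whose product with mult still fits in 64 bits. */
	uint64_t max_cycles = mult ? UINT64_MAX / mult : UINT64_MAX;

	/* Fast path: the 64-bit product cannot overflow. */
	if (cycles <= max_cycles)
		return (cycles * mult) >> shift;

	/* Slow path: widen to 128 bits so the product cannot overflow. */
	return (uint64_t)(((unsigned __int128)cycles * mult) >> shift);
}
```

A kernel version would presumably put a WARN_ON in front of the slow path, since the max_delta check is expected to keep the code on the fast path.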
+		cur >>= hyp_clock->shift;
+		cur += hyp_clock->boot;
+
+		err = abs_diff(cur, boot);
+
+		/* No deviation, only update epoch if necessary */
+		if (!err) {
+			if (delta_cycles >= hyp_clock->max_delta)
+				goto update_hyp;
+
+			goto resched;
+		}
+
+		/* Warn if the error is above tracing precision (1us) */
+		if (hyp_buffer->tracing_on && err > NSEC_PER_USEC)
+			pr_warn_ratelimited("hyp trace clock off by %lluus\n",
+					    err / NSEC_PER_USEC);

I'm curious in practice, does this come up often? If so, does it converge down nicely? Have you done much disruption testing using adjtimex?
So far, I haven't seen any error above ~100 ns on the machine I have tested with. But that's a good point, I'll check how it looks when the boot clock is less stable.
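
To put the 1us warning threshold in perspective, here is a back-of-the-envelope sketch of how quickly a frequency offset on the boot clock turns into error between two clock updates. drift_error_ns() is a hypothetical helper, and the 500 ppm figure in the note below is only an example value, not a measurement from this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: nanoseconds of error accumulated over interval_ns
 * when the reference clock runs ppm parts-per-million away from the rate
 * the hyp clock was calibrated against. Assumes interval_ns * ppm fits
 * in 64 bits.
 */
static uint64_t drift_error_ns(uint64_t interval_ns, uint32_t ppm)
{
	return interval_ns * ppm / 1000000ULL;
}
```

With a 500 ppm offset, drift_error_ns(2000000, 500) gives 1000 ns, so the 1us threshold would be reached about 2 ms after the last update.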
+	}
+
+	if (delta_boot > U32_MAX) {
+		do_div(delta_boot, NSEC_PER_SEC);
+		rate = delta_cycles;
+	} else {
+		rate = delta_cycles * NSEC_PER_SEC;
+	}
+
+	do_div(rate, delta_boot);
+
+	clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift,
+			       rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S);
+
+update_hyp:
+	hyp_clock->max_delta = (U64_MAX / hyp_clock->mult) >> 1;
+	hyp_clock->cycles = snap.cycles;
+	hyp_clock->boot = boot;
+	kvm_call_hyp_nvhe(__pkvm_update_clock_tracing, hyp_clock->mult,
+			  hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles);
+	complete(&hyp_clock->ready);

I'm very forgetful, so maybe it's unnecessary, but for future-you or just others like me, it might be worth adding some extra comments to clarify the assumptions in these calculations.
Ack.
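
As an illustration of the assumptions in question, here is a commented user-space mirror of the rate and max_delta calculations from the quoted hunk. The function names are hypothetical, plain division stands in for do_div() (which takes a 32-bit divisor, hence the U32_MAX check), and the zero-interval guard is an addition for the sketch.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical mirror of the rate estimate: returns cycles per second. */
static uint64_t estimate_rate_hz(uint64_t delta_cycles, uint64_t delta_boot_ns)
{
	uint64_t rate;

	if (delta_boot_ns == 0)	/* guard added for the sketch only */
		return 0;

	if (delta_boot_ns > UINT32_MAX) {
		/*
		 * Long interval (more than ~4.29 s): the divisor no longer
		 * fits in 32 bits, so convert it to whole seconds and
		 * accept the loss of sub-second precision.
		 */
		delta_boot_ns /= NSEC_PER_SEC;
		rate = delta_cycles;
	} else {
		/*
		 * Short interval: scale the cycles up instead. Assumes the
		 * counter is slow enough that this product fits in 64 bits.
		 */
		rate = delta_cycles * NSEC_PER_SEC;
	}

	return rate / delta_boot_ns;
}

/*
 * Largest cycle delta the mult/shift conversion is trusted for: half of
 * what would overflow 64 bits. Assumes mult != 0.
 */
static uint64_t max_delta_cycles(uint32_t mult)
{
	return (UINT64_MAX / mult) >> 1;
}
```

For example, 10^10 cycles over 10 s take the first branch and 10^6 cycles over 1 ms take the second; both yield a rate of 10^9 cycles per second.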
thanks
-john
Thanks for your time!

--
Vincent