Thread: 66 messages, 9 authors, 2024-07-06

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

From: Peter Hilber <hidden>
Date: 2024-03-13 17:51:09
Also in: linux-rtc, lkml, netdev, virtualization

On 13.03.24 15:06, David Woodhouse wrote:
On Wed, 2024-03-13 at 13:58 +0100, Alexandre Belloni wrote:
The TSC or whatever CPU counter/clock that is used to keep the system
time is not an RTC, I don't get why it has to be exposed as such to the
guests. PTP is fine and precise, RTC is not.
Ah, I see. But the point of the virtio_rtc is not really to expose that
CPU counter. The point is to report the wallclock time, just like an
actual RTC. The real difference is the *precision*.

The virtio_rtc device has a facility to *also* expose the counter,
because that's what we actually need to gain that precision...

Applications don't read the RTC every time they want to know what the
time is. These days, they don't even make a system call; it's done
entirely in userspace mode. The kernel exposes some shared memory,
essentially saying "the counter was X at time Y, and runs at Z Hz".
Then applications just read the CPU counter and do some arithmetic.
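
As a rough illustration of that arithmetic (a sketch only, not the actual vDSO code; the function and parameter names are made up for this example), converting a counter reading to wallclock time from the published values could look like this:

```python
NSEC_PER_SEC = 1_000_000_000


def counter_to_wallclock_ns(counter_now, x, y_ns, z_hz):
    """Extrapolate wallclock time (ns) from a raw counter reading.

    x     -- counter value at the reference point
    y_ns  -- wallclock time in nanoseconds when the counter read x
    z_hz  -- counter frequency in Hz
    """
    elapsed_ticks = counter_now - x
    # Integer arithmetic avoids losing nanoseconds to float rounding.
    return y_ns + elapsed_ticks * NSEC_PER_SEC // z_hz
```

With a 25 MHz counter, a reading 25,000,000 ticks past X maps to exactly one second after Y. Real implementations typically replace the division with a precomputed multiply-and-shift and guard the shared values against concurrent updates; both are left out here.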

As we require more and more precision in the calibration, it becomes
important to get *paired* readings of the CPU counter and the wallclock
time at precisely the same moment. If the guest has to read one and
then the other, potentially taking interrupts, getting preempted and
suffering steal/SMI time in the middle, that introduces an error which
is increasingly significant as we increasingly care about precision.
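
To put a number on that error, here is a small sketch (hypothetical helper name, assuming the guest brackets the wallclock read between two counter reads) of how wide the uncertainty becomes when the two readings are not taken atomically:

```python
NSEC_PER_SEC = 1_000_000_000


def unpaired_uncertainty_ns(counter_before, counter_after, z_hz):
    """Upper bound (ns) on the pairing error of a bracketed wallclock read.

    The wallclock was sampled at some unknown instant between the two
    counter reads, so the matching counter value is only known to lie
    in [counter_before, counter_after].
    """
    window_ticks = counter_after - counter_before
    # Ceiling division, so the bound is never understated.
    return (window_ticks * NSEC_PER_SEC + z_hz - 1) // z_hz
```

On a 25 MHz counter, two reads 50 ticks apart bound the error at 2 us. If the vCPU is preempted for 1 ms in between (25,000 ticks), the bound grows to 1 ms, which is the effect a paired reading from the device avoids.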

Peter's proposal exposes the pairs of {X,Y} and leaves *all* the guest
kernels having to repeat readings over time and perform the calibration
as the underlying hardware oscillator frequency (Z) drifts with
temperature. I'm trying to get him to let the hypervisor expose the
calibrated frequency Z too. Along with *error* bounds for ±δX and ±δZ.
Which aside from reducing the duplication of effort, will *also* fix
the problem of live migration where *all* those things suffer a step
change and leave the guest with an inaccurate clock but not knowing it.
I am already convinced that this would work significantly better than the
{X,Y} pair (but would be a bit more effort to implement):

1. when accessed by user space, obviously

2. when backing the PTP clock, it saves CPU time and makes non-paired
   reads more precise.

I would just prefer to try upstreaming the {X,Y} pairing first. I think the
{X,Y,Z...} pairing could be discussed and developed in parallel.
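
For the sake of discussion, a sketch of how a consumer might use such an extended record. Everything here is an assumption for illustration (the function name, expressing δX in ticks and δZ in Hz, forward extrapolation only); none of it is taken from the virtio_rtc draft:

```python
NSEC_PER_SEC = 1_000_000_000


def wallclock_bounds_ns(counter_now, x, y_ns, z_hz, delta_x, delta_z_hz):
    """Return (earliest, latest) wallclock time in ns for counter_now.

    delta_x     -- assumed error bound on X, in counter ticks
    delta_z_hz  -- assumed error bound on Z, in Hz
    """
    ticks = counter_now - x
    if ticks < delta_x or delta_z_hz >= z_hz:
        raise ValueError("only forward extrapolation with delta_z < Z is handled")
    # Fewest ticks at the fastest plausible rate gives the earliest time.
    earliest = y_ns + (ticks - delta_x) * NSEC_PER_SEC // (z_hz + delta_z_hz)
    # Most ticks at the slowest plausible rate gives the latest time
    # (ceiling division, so the interval is never too narrow).
    slow_hz = z_hz - delta_z_hz
    latest = y_ns + ((ticks + delta_x) * NSEC_PER_SEC + slow_hz - 1) // slow_hz
    return earliest, latest
```

With zero deltas the interval collapses to the plain {X,Y,Z} extrapolation. The interval widens the further the counter runs past X, so a guest could tell from it when a fresh reading is due, and a host could presumably publish larger deltas after a live migration instead of leaving the guest with a silently wrong clock.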

Thanks for the comments,

Peter

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel