Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes
From: Peter Hilber <hidden>
Date: 2024-03-13 17:51:09
Also in: linux-rtc, lkml, netdev, virtualization
On 13.03.24 15:06, David Woodhouse wrote:
> On Wed, 2024-03-13 at 13:58 +0100, Alexandre Belloni wrote:
> > The TSC or whatever CPU counter/clock that is used to keep the system
> > time is not an RTC, I don't get why it has to be exposed as such to the
> > guests. PTP is fine and precise, RTC is not.
>
> Ah, I see. But the point of the virtio_rtc is not really to expose that
> CPU counter. The point is to report the wallclock time, just like an
> actual RTC. The real difference is the *precision*.
>
> The virtio_rtc device has a facility to *also* expose the counter,
> because that's what we actually need to gain that precision...
>
> Applications don't read the RTC every time they want to know what the
> time is. These days, they don't even make a system call; it's done
> entirely in userspace mode. The kernel exposes some shared memory,
> essentially saying "the counter was X at time Y, and runs at Z Hz".
> Then applications just read the CPU counter and do some arithmetic.
>
> As we require more and more precision in the calibration, it becomes
> important to get *paired* readings of the CPU counter and the wallclock
> time at precisely the same moment. If the guest has to read one and
> then the other, potentially taking interrupts, getting preempted and
> suffering steal/SMI time in the middle, that introduces an error which
> is increasingly significant as we increasingly care about precision.
>
> Peter's proposal exposes the pairs of {X,Y} and leaves *all* the guest
> kernels having to repeat readings over time and perform the calibration
> as the underlying hardware oscillator frequency (Z) drifts with
> temperature. I'm trying to get him to let the hypervisor expose the
> calibrated frequency Z too. Along with *error* bounds for ±δX and ±δZ.
> Which aside from reducing the duplication of effort, will *also* fix
> the problem of live migration where *all* those things suffer a step
> change and leave the guest with an inaccurate clock but not knowing it.

I am already convinced that this would work significantly better than the
{X,Y} pair (but would be a bit more effort to implement):
1. when accessed by user space, obviously
2. when backing the PTP clock, it saves CPU time and makes non-paired
reads more precise.
I would just prefer to try upstreaming the {X,Y} pairing first. I think the
{X,Y,Z...} pairing could be discussed and developed in parallel.
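
To make the difference between the {X,Y} and {X,Y,Z} variants concrete, here is a minimal sketch of the arithmetic discussed in this thread. It is illustrative only: the names ClockParams, counter_to_time_ns and estimate_freq_hz are hypothetical and do not come from the virtio_rtc patches or the spec draft, and error bounds (±δX, ±δZ) are left out. With {X,Y,Z} the reader only extrapolates; with {X,Y} alone the guest first has to derive Z from two paired readings taken some time apart.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass(frozen=True)
class ClockParams:
    """Hypothetical container for 'the counter was X at time Y, and runs at Z Hz'."""
    counter: int   # X: CPU counter value at the reference point
    time_ns: int   # Y: wallclock time at the reference point, in nanoseconds
    freq_hz: int   # Z: counter frequency in Hz


def counter_to_time_ns(params: ClockParams, now_counter: int) -> int:
    """Extrapolate wallclock time from a fresh counter read (the {X,Y,Z} case)."""
    delta = now_counter - params.counter
    return params.time_ns + delta * NSEC_PER_SEC // params.freq_hz


def estimate_freq_hz(x0: int, y0_ns: int, x1: int, y1_ns: int) -> int:
    """Guest-side calibration of Z from two paired {X,Y} readings."""
    if y1_ns <= y0_ns:
        raise ValueError("second reading must be later than the first")
    return (x1 - x0) * NSEC_PER_SEC // (y1_ns - y0_ns)


# {X,Y} only: the guest calibrates Z itself, then extrapolates.
z = estimate_freq_hz(0, 0, 3_000_000, NSEC_PER_SEC)
params = ClockParams(counter=3_000_000, time_ns=NSEC_PER_SEC, freq_hz=z)
now_ns = counter_to_time_ns(params, 4_500_000)
```

In this sketch, any delay between reading X and reading Y in a non-paired scheme shows up directly as an error in both the calibrated Z and the extrapolated time, which is why the paired reading matters.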
Thanks for the comments,
Peter
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel