Re: [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation
From: "Rafael J. Wysocki" <rafael@kernel.org>
Date: 2020-01-13 11:48:21
Also in:
linux-mm, linux-pm, lkml, xen-devel
On Mon, Jan 13, 2020 at 12:43 PM Singh, Balbir wrote:
> On Mon, 2020-01-13 at 11:16 +0100, Peter Zijlstra wrote:
> > On Fri, Jan 10, 2020 at 07:35:20AM -0800, Eduardo Valentin wrote:
> > > Hey Peter,
> > >
> > > On Wed, Jan 08, 2020 at 11:50:11AM +0100, Peter Zijlstra wrote:
> > > > On Tue, Jan 07, 2020 at 11:45:26PM +0000, Anchal Agarwal wrote:
> > > > > From: Eduardo Valentin <redacted>
> > > > >
> > > > > System instability is seen during resume from hibernation when
> > > > > system is under heavy CPU load. This is due to the lack of update
> > > > > of sched clock data, and the scheduler would then think that heavy
> > > > > CPU hog tasks need more time in CPU, causing the system to freeze
> > > > > during the unfreezing of tasks. For example, threaded irqs, and
> > > > > kernel processes servicing network interface may be delayed for
> > > > > several tens of seconds, causing the system to be unreachable.
> > > > >
> > > > > The fix for this situation is to mark the sched clock as unstable
> > > > > as early as possible in the resume path, leaving it unstable for
> > > > > the duration of the resume process. This will force the scheduler
> > > > > to attempt to align the sched clock across CPUs using the delta
> > > > > with time of day, updating sched clock data.
> > > > >
> > > > > In a post hibernation event, we can then mark the sched clock as
> > > > > stable again, avoiding unnecessary syncs with time of day on
> > > > > systems in which TSC is reliable.
> > > >
> > > > This makes no frigging sense what so bloody ever. If the clock is
> > > > stable, we don't care about sched_clock_data. When it is stable you
> > > > get a linear function of the TSC without complicated bits on.
> > > >
> > > > When it is unstable, only then do we care about the sched_clock_data.
> > >
> > > Yeah, maybe what is not clear here is that we are covering for the
> > > situation where clock stability changes over time, e.g. at regular
> > > boot clock is stable, hibernation happens, then restore happens in a
> > > non-stable clock.
> >
> > Still confused, who marks the thing unstable? The patch seems to suggest
> > you do yourself, but it is not at all clear why.
> >
> > If TSC really is unstable, then it needs to remain unstable. If the TSC
> > really is stable then there is no point in marking it unstable.
> >
> > Either way something is off, and you're not telling me what.
>
> Hi, Peter
>
> For your original comment, just wanted to clarify the following:
>
> 1. After hibernation, the machine can be resumed on a different but
>    compatible host (these are VM images hibernated)
> 2. This means the clock between host1 and host2 can/will be different

So the problem is specific to this particular use case. I'm not sure why to impose this hack on hibernation in all cases.
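
For readers following the stable/unstable distinction discussed above, here is a minimal userspace sketch, not kernel code. It assumes a toy model in which the stable path returns the raw counter directly (the "linear function of the TSC") and the unstable path re-anchors to the last time-of-day sample with a clamped delta. The names `toy_clock`, `toy_sched_clock` and `TOY_TICK_NS` are hypothetical, and the clamping is only loosely modelled on what the thread calls sched_clock_data.

```c
#include <stdint.h>

/* Toy model only: every name here is hypothetical, none is a kernel API. */
#define TOY_TICK_NS 1000000ULL /* assumed 1 ms tick, for illustration */

struct toy_clock {
	int stable;         /* 1: trust the raw counter as-is */
	uint64_t tick_raw;  /* raw counter value sampled at the last tick */
	uint64_t tick_gtod; /* time of day sampled at the last tick */
};

/*
 * Stable: the result is the raw counter itself, so any jump in the
 * counter (e.g. resuming on a host with a different TSC) shows up
 * directly in the returned time.
 *
 * Unstable: the result is the last time-of-day sample plus the raw
 * delta since that sample, clamped to [0, TOY_TICK_NS], so a large
 * counter jump is bounded to one tick.
 */
static uint64_t toy_sched_clock(const struct toy_clock *c, uint64_t raw_ns)
{
	if (c->stable)
		return raw_ns;

	uint64_t delta = raw_ns >= c->tick_raw ? raw_ns - c->tick_raw : 0;
	if (delta > TOY_TICK_NS)
		delta = TOY_TICK_NS;
	return c->tick_gtod + delta;
}
```

In this toy model, a ten second counter jump passes straight through on the stable path, while the unstable path advances by at most one tick past the last time-of-day sample. Whether that bounded behaviour is worth forcing on every hibernation, rather than only on the cross-host VM case, is the open question in this thread.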