Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree
From: Thomas Gleixner <hidden>
Date: 2007-03-07 01:16:47
Also in: lkml
On Tue, 2007-03-06 at 16:42 -0800, Dan Hecht wrote:
>>> accounting would be wrong. Instead, we should allow the tick_sched_timer in cases (c) and (d) to have runtime configurable period, and then scale the time value accordingly before passing to account_system_time. This is probably something the Xen folks will want also, since I think Xen itself only gets 100hz hard timer, and so it can implement at best a oneshot virtual timer with 100hz resolution. Any objections to us doing something like this?
>>
>> Yes. It's gross hackery.
>>
>> 1) We want to have a cleanup of the tick assumptions _all_ over the place and this is going to be real hard work.
>>
>> 2) As I said above. The time accounting for virtualization needs to be fixed in a generic way.
>>
>> I'm not going to accept some weird hackery for virtualization, which is of exactly ZERO value for the kernel itself. Quite the contrary it will make the cleanup harder and introduce another hard to remove thing, which will in the worst case last for ever.
>
> Okay, to confirm I'm on the same page as you, you want to move process time accounting from being periodic sampled based to being trace based? i.e. at the system-call/interrupt boundaries, read clocksource and compute directly the amount of system/user/process time?
At least for the paravirt guests this is the correct approach. Once the CPU vendors come up with a sane solution for a reliable and fast clock source we might use that on real hardware as well.
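For illustration only, the trace-based scheme discussed here can be sketched in user-space C roughly as follows. This is not kernel code: `struct task_acct`, `acct_init()` and `acct_boundary()` are hypothetical names made up for this sketch, and the `now_ns` argument stands in for whatever clocksource read would happen at a system-call or interrupt boundary. The sketch also assumes the clocksource is monotonic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accounting modes; not taken from any kernel header. */
enum acct_mode { ACCT_USER, ACCT_SYSTEM };

/* Hypothetical per-task accounting state for this sketch. */
struct task_acct {
	enum acct_mode mode;	/* mode the task has been running in */
	uint64_t last_ns;	/* clocksource reading at the last boundary */
	uint64_t user_ns;	/* accumulated user time */
	uint64_t system_ns;	/* accumulated system time */
};

/* Start accounting in user mode at the given clocksource reading. */
static void acct_init(struct task_acct *t, uint64_t now_ns)
{
	t->mode = ACCT_USER;
	t->last_ns = now_ns;
	t->user_ns = 0;
	t->system_ns = 0;
}

/*
 * Called at every boundary (syscall entry/exit, interrupt entry/exit)
 * with a fresh clocksource reading. The time elapsed since the previous
 * boundary is charged to the mode we are leaving, so no periodic tick
 * is needed to attribute time.
 */
static void acct_boundary(struct task_acct *t, enum acct_mode new_mode,
			  uint64_t now_ns)
{
	uint64_t delta;

	/* Assumption of this sketch: the clocksource never goes backwards. */
	assert(now_ns >= t->last_ns);
	delta = now_ns - t->last_ns;

	if (t->mode == ACCT_USER)
		t->user_ns += delta;
	else
		t->system_ns += delta;

	t->mode = new_mode;
	t->last_ns = now_ns;
}
```

As a usage example: initialize at reading 0, call `acct_boundary(&t, ACCT_SYSTEM, 300)` on syscall entry, `acct_boundary(&t, ACCT_USER, 500)` on syscall exit, and `acct_boundary(&t, ACCT_USER, 1000)` at a later boundary. The task is then charged 800 ns of user time and 200 ns of system time, independent of any tick period. The cost is one clocksource read per boundary, which is why the speed of the clock source matters so much.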
> Do you know if anyone has explored this? I thought there was a discussion about this a while back but it was rejected due to the sample-based approach having much lower overheads on high system call rate workloads.
Yes, with today's hardware it is simply a PITA. PowerPC has some basic support for this though, IIRC.

tglx