Thread: 108 messages, 8 authors, 2014-11-24

[PATCH v2 08/11] sched: get CPU's activity statistic

From: vincent.guittot@linaro.org (Vincent Guittot)
Date: 2014-06-04 11:07:53
Also in: lkml

On 4 June 2014 12:36, Morten Rasmussen [off-list ref] wrote:
On Wed, Jun 04, 2014 at 11:17:24AM +0100, Peter Zijlstra wrote:
On Wed, Jun 04, 2014 at 11:32:10AM +0200, Vincent Guittot wrote:
On 4 June 2014 10:08, Peter Zijlstra [off-list ref] wrote:
On Wed, Jun 04, 2014 at 09:47:26AM +0200, Vincent Guittot wrote:
On 3 June 2014 17:50, Peter Zijlstra [off-list ref] wrote:
On Wed, May 28, 2014 at 04:47:03PM +0100, Morten Rasmussen wrote:
Since we may do periodic load-balance every 10 ms or so, we will perform
a number of load-balances where runnable_avg_sum will mostly be
reflecting the state of the world before a change (new task queued or
moved a task to a different cpu). If you have two tasks continuously
on one cpu and your other cpu is idle, and you move one of the tasks to
the other cpu, runnable_avg_sum will remain unchanged, 47742, on the
first cpu while it starts from 0 on the other one. 10 ms later it will
have increased a bit, 32 ms later it will be 47742/2, and 345 ms later
it reaches 47742. In the meantime the cpu doesn't appear fully utilized
and we might decide to put more tasks on it because we don't know if
runnable_avg_sum represents a partially utilized cpu (for example a 50%
task) or if it will continue to rise and eventually get to 47742.
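The ramp-up described above can be sketched with a simple geometric series. This is only a rough illustration, not the kernel's implementation: it assumes a decay half-life of 32 periods (taken from the "32 ms later it will be 47742/2" remark) and treats one period as 1 ms. The names HALF_LIFE_MS, Y and ramp_fraction are hypothetical.

```python
# Sketch: how fast a runnable average that starts from 0 approaches its
# maximum when the cpu is continuously busy.
# Assumption: geometric decay with a half-life of 32 periods, 1 period = 1 ms.
HALF_LIFE_MS = 32
Y = 0.5 ** (1 / HALF_LIFE_MS)  # per-period decay factor, Y**32 == 0.5


def ramp_fraction(ms_runnable):
    """Fraction of the saturation value reached after ms_runnable
    milliseconds of continuous runnable time, starting from 0."""
    return 1.0 - Y ** ms_runnable


if __name__ == "__main__":
    for ms in (10, 32, 345):
        print(f"{ms:3d} ms: {ramp_fraction(ms):.4f} of the maximum")
```

Under these assumptions the newly loaded cpu shows roughly 19% of the maximum after 10 ms and exactly half after 32 ms, so a balancer looking at it in that window cannot tell a ramping 100% task from a steady partial load.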
Ah, no, since we track per task, and update the per-cpu ones when we
migrate tasks, the per-cpu values should be instantly updated.

If we were to increase per task storage, we might as well also track
running_avg not only runnable_avg.
I agree that the removed running_avg should give more useful
information about the load of a CPU.

The main issue with running_avg is that it's disturbed by other tasks
(as pointed out previously). As a typical example, if we have 2 tasks
with a load of 25% on 1 CPU, the unweighted runnable_load_avg will be
in the range of [100% - 50%] depending on the parallelism of the
runtime of the tasks whereas the reality is 50% and the use of
running_avg will return this value.
I'm not sure I see how 100% is possible, but yes I agree that runnable
can indeed be inflated due to this queueing effect.
Let me explain the 75%, take any one of the above scenarios. Let's call
the two tasks A and B, and let's for a moment assume A always wins and
runs first, and then B.

So A will be runnable for 25%, B otoh will be runnable the entire time A
is actually running plus its own running time, giving 50%. Together that
makes 75%.

If you release the assumption that A runs first, but instead assume they
equally win the first execution, you get them averaging at 37.5% each,
which combined will still give 75%.
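The 75% arithmetic above can be checked with a few lines. This is a minimal sketch that assumes both tasks wake at the same instant and each runs to completion once it gets the cpu; runnable_fractions and the busy table are hypothetical names used only for illustration.

```python
from itertools import permutations


def runnable_fractions(order, busy):
    """Run-to-completion on one cpu: a task is runnable while it waits for
    every task ahead of it in 'order', plus its own running time."""
    waited = 0.0
    result = {}
    for name in order:
        result[name] = waited + busy[name]
        waited += busy[name]
    return result


busy = {"A": 0.25, "B": 0.25}  # each task runs 25% of the period

# A always wins: A is runnable 25%, B is runnable 50%, 75% combined.
a_first = runnable_fractions(("A", "B"), busy)
total_a_first = sum(a_first.values())

# Each task wins half of the time: average over both orders.
orders = list(permutations(busy))
average = {
    name: sum(runnable_fractions(o, busy)[name] for o in orders) / len(orders)
    for name in busy
}

if __name__ == "__main__":
    print(a_first, total_a_first)
    print(average, sum(average.values()))
```

Either way the summed runnable time is 75% of the period, while the cpu is actually busy for only 50% of it.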
But that is assuming that the first task gets to run to completion of its
busy period. If it uses up its sched_slice and we switch to the other
task, they both get to wait.

For example, if the sched_slice is 5 ms and the busy period is 10 ms,
the execution pattern would be: A, B, A, B, idle, ... In that case A is
runnable for 15 ms and B is for 20 ms. Assuming that the overall period
is 40 ms, the A runnable is 37.5% and B is 50%.
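The execution pattern in this example can be reproduced with a small round-robin model. It is a sketch under simplifying assumptions: all tasks wake at t = 0, the slice is fixed, and there is no preemption or wakeup other than slice expiry. round_robin_runnable is a hypothetical helper, not a kernel function.

```python
def round_robin_runnable(busy_ms, slice_ms):
    """Return, per task, how many ms it stays runnable (waiting or running)
    when all tasks wake at t = 0 and share one cpu in fixed slices."""
    remaining = dict(busy_ms)
    finished_at = {}
    now = 0
    while remaining:
        # Iterate over a copy because finished tasks are removed.
        for name in list(remaining):
            run = min(slice_ms, remaining[name])
            now += run
            remaining[name] -= run
            if remaining[name] == 0:
                finished_at[name] = now
                del remaining[name]
    return finished_at


PERIOD_MS = 40
runnable = round_robin_runnable({"A": 10, "B": 10}, slice_ms=5)
ratios = {name: ms / PERIOD_MS for name, ms in runnable.items()}

if __name__ == "__main__":
    print(runnable)  # ms each task was runnable
    print(ratios)    # as a fraction of the 40 ms period
```

Setting slice_ms to 10 gives the run-to-completion case again (A runnable for 10 ms, B for 20 ms), which makes the extra waiting caused by slicing easy to compare.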
The exact value for your scheduling example above is:
A runnable will be 47% and B runnable will be 60% (unless I made a
mistake in my computation)
and CPU runnable will be 60% too

Vincent
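
The message does not say how the 47% and 60% figures were obtained. One reading that comes close to them is sketched below: apply a geometric decay with a 32 ms half-life (an assumption based on the "32 ms later it will be 47742/2" remark earlier in the thread), repeat the 40 ms pattern until it reaches steady state, and sample the average at the end of each runnable interval. The names Y and steady_state_peak are hypothetical.

```python
# Assumption: geometric decay with a 32 ms half-life, 1 ms resolution.
HALF_LIFE_MS = 32
Y = 0.5 ** (1 / HALF_LIFE_MS)


def steady_state_peak(runnable_ms, period_ms):
    """Peak of the decayed runnable fraction for a task that is runnable for
    the first runnable_ms of every period_ms, once the pattern has repeated
    long enough to reach steady state.

    With v the value at the start of a period:
        v = (v * Y**r + 1 - Y**r) * Y**(p - r)
    and the peak at the end of the runnable interval simplifies to
        (1 - Y**r) / (1 - Y**p)
    """
    return (1 - Y ** runnable_ms) / (1 - Y ** period_ms)


a_peak = steady_state_peak(15, 40)    # A: runnable 15 ms out of 40
b_peak = steady_state_peak(20, 40)    # B: runnable 20 ms out of 40
cpu_peak = steady_state_peak(20, 40)  # the cpu has work for the same 20 ms

if __name__ == "__main__":
    print(f"A: {a_peak:.3f}  B: {b_peak:.3f}  CPU: {cpu_peak:.3f}")
```

Under these assumptions A peaks just under 48% and both B and the cpu peak just above 60%, which is in line with the figures quoted above and noticeably higher than the plain ratios of 37.5% and 50%.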