Thread (66 messages) 66 messages, 8 authors, 2014-07-18

[PATCH v3 01/12] sched: fix imbalance flag reset

From: vincent.guittot@linaro.org (Vincent Guittot)
Date: 2014-07-10 09:14:55
Also in: lkml

On 9 July 2014 12:43, Peter Zijlstra [off-list ref] wrote:
On Wed, Jul 09, 2014 at 09:24:54AM +0530, Preeti U Murthy wrote:
[snip]
quoted
Continuing with the above explanation; when LBF_ALL_PINNED flag is
set,and we jump to out_balanced, we clear the imbalance flag for the
sched_group comprising of cpu0 and cpu1,although there is actually an
imbalance. t2 could still be migrated to say cpu2/cpu3 (t2 has them in
its cpus allowed mask) in another sched group when load balancing is
done at the next sched domain level.
And this is where Vince is wrong; note how
update_sg_lb_stats()/sg_imbalance() uses group->sgc->imbalance, but
load_balance() sets: sd_parent->groups->sgc->imbalance, so explicitly
one level up.
I forgot this behavior when studying preeti use case
So what we can do I suppose is clear 'group->sgc->imbalance' at
out_balanced.

In any case, the entirely of this group imbalance crap is just that,
crap. Its a terribly difficult situation and the current bits more or
less fudge around some of the common cases. Also see the comment near
sg_imbalanced(). Its not a solid and 'correct' anything. Its a bunch of
hacks trying to deal with hard cases.

A 'good' solution would be prohibitively expensive I fear.
I have tried to summarized several use cases that have been discussed
for this patch

The 1st use case is the one that i described in the commit message of
this patch: If we have a sporadic imbalance that set the imbalance
flag, we don't clear it after and it generates spurious and useless
active load balance

Then preeti came with the following use case :
we have a sched_domain made of CPU0 and CPU1 in 2 different sched_groups
2 tasks A and B are on CPU0, B can't run on CPU1, A is the running task.
When CPU1's sched_group is doing load balance, the imbalance should be
set. That's still happen with this patchset because the LBF_ALL_PINNED
flag will be cleared thanks to task A.

Preeti also explained me the following use cases on irc:

If we have both tasks A and B that can't run on CPU1, the
LBF_ALL_PINNED will stay set. As we can't do anything, we conclude
that we are balanced, we go to out_balanced and we clear the imbalance
flag. But we should not consider that as a balanced state but as a all
tasks pinned state instead and we should let the imbalance flag set.
If we now have 2 additional CPUs which are in the cpumask of task A
and/or B at the parent sched_domain level , we should migrate one task
in this group but this will not happen (with this patch) because the
sched_group made of CPU0 and CPU1 is not overloaded (2 tasks for 2
CPUs) and the imbalance flag has been cleared as described previously.

I'm going to send a new revision of the patchset with the correction

Vincent
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help