Re: [PATCH 1/1] sched/fair: Fix unfairness caused by missing load decay
From: Odin Ugedal <hidden>
Date: 2021-05-01 14:34:26
Also in:
lkml
ons. 28. apr. 2021 kl. 17:36 skrev Vincent Guittot [off-list ref]:
You can keep both fixes tags
ACK
If the cfs_rq is already in the list list_add_leaf_cfs_rq() will exit early but if it's not, we don't have to make sure that the whole branch in the list
Yeah, thats right. Calling list_add_leaf_cfs_rq once "too much" doesnt hurt after all.
In fact, we can break as soon as list_add_leaf_cfs_rq() and cfs_rq_throttled() return true
ACK, that makes sense.
When a cfs_rq is throttled, it is not accounted in its parent anymore so we don't have to update and propagate the load down.
Okay. Still need to wrap my head around this a bit more I guess. I have looked a bit around, and there is actually a similar issue as "this one" for the case when a throttled cgroup is "moved" via cpuset. It is however waaay harder to reproduce, but it is doable, and it _will_ happen in real life systems if the timing is "correct". I will dig deeper and finish the patch for that case some time next week (hopefully). I think that however deserve a separate patchset, so I will come back with that later.
quoted hunk ↗ jump to hunk
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 33b1ee31ae0f..18441ce7316c 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c@@ -11026,10 +11026,10 @@ static void propagate_entity_cfs_rq(struct sched_entity *se) for_each_sched_entity(se) { cfs_rq = cfs_rq_of(se); - if (cfs_rq_throttled(cfs_rq)) - break; + if (!cfs_rq_throttled(cfs_rq)) + update_load_avg(cfs_rq, se, UPDATE_TG); - update_load_avg(cfs_rq, se, UPDATE_TG); + list_add_leaf_cfs_rq(cfs_rq); }}
Sent a v2 with something like this now; that exit if (list_add_leaf_cfs_rq(cfs_rq) && throttled). Since this loop start at the parent of the cfs_rq of the supplied se, I added a list_add_leaf_cfs_rq to the top in order to insert the leaf cfs_rq as well. Thanks Odin