Thread (35 messages), 5 authors, 2023-05-24

Re: [PATCH v5 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

From: Vincent Guittot <vincent.guittot@linaro.org>
Date: 2021-09-17 07:41:37
Also in: linuxppc-dev

On Fri, 17 Sept 2021 at 03:01, Ricardo Neri
[off-list ref] wrote:
On Wed, Sep 15, 2021 at 05:43:44PM +0200, Vincent Guittot wrote:
quoted
On Sat, 11 Sept 2021 at 03:19, Ricardo Neri
[off-list ref] wrote:
quoted
When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
check for the idle state of the destination CPU, dst_cpu, but also of
its SMT siblings.

If dst_cpu is idle but its SMT siblings are busy, performance suffers
if it pulls tasks from a medium priority CPU that does not have SMT
siblings.

Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
siblings of both dst_cpu and the CPUs in the candidate busiest group.

Cc: Aubrey Li <redacted>
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <redacted>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Quentin Perret <redacted>
Cc: Rafael J. Wysocki <redacted>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <redacted>
Reviewed-by: Joel Fernandes (Google) <redacted>
Reviewed-by: Len Brown <redacted>
Signed-off-by: Ricardo Neri <redacted>
---
Changes since v4:
  * Use sg_lb_stats::sum_nr_running to determine the idle state of a
    scheduling group. (Vincent, Peter)
  * Do not even out idle CPUs in asym_smt_can_pull_tasks(). (Vincent)
  * Updated function documentation and corrected a typo.

Changes since v3:
  * Removed the arch_asym_check_smt_siblings() hook. Discussions with the
    powerpc folks showed that this patch should not impact them. Also, more
    recent powerpc processors no longer use asym_packing. (PeterZ)
  * Removed unnecessary local variable in asym_can_pull_tasks(). (Dietmar)
  * Removed unnecessary check for local CPUs when the local group has zero
    utilization. (Joel)
  * Renamed asym_can_pull_tasks() as asym_smt_can_pull_tasks() to reflect
    the fact that it deals with SMT cases.
  * Made asym_smt_can_pull_tasks() return false for !CONFIG_SCHED_SMT so
    that callers can deal with non-SMT cases.

Changes since v2:
  * Reworded the commit message to reflect updates in code.
  * Corrected misrepresentation of dst_cpu as the CPU doing the load
    balancing. (PeterZ)
  * Removed call to arch_asym_check_smt_siblings() as it is now called in
    sched_asym().

Changes since v1:
  * Don't bail out in update_sd_pick_busiest() if dst_cpu cannot pull
    tasks. Instead, reclassify the candidate busiest group, as it
    may still be selected. (PeterZ)
  * Avoid an expensive and unnecessary call to cpumask_weight() when
    determining if a sched_group is comprised of SMT siblings.
    (PeterZ).
---
 kernel/sched/fair.c | 94 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26db017c14a3..8d763dd0174b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8597,10 +8597,98 @@ group_type group_classify(unsigned int imbalance_pct,
        return group_has_spare;
 }

+/**
+ * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
+ * @dst_cpu:   Destination CPU of the load balancing
+ * @sds:       Load-balancing data with statistics of the local group
+ * @sgs:       Load-balancing statistics of the candidate busiest group
+ * @sg:                The candidate busiest group
+ *
+ * Check the state of the SMT siblings of both @sds::local and @sg and decide
+ * if @dst_cpu can pull tasks.
+ *
+ * If @dst_cpu does not have SMT siblings, it can pull tasks if two or more of
+ * the SMT siblings of @sg are busy. If only one CPU in @sg is busy, pull tasks
+ * only if @dst_cpu has higher priority.
+ *
+ * If both @dst_cpu and @sg have SMT siblings, and @sg has exactly one more
+ * busy CPU than @sds::local, let @dst_cpu pull tasks if it has higher priority.
+ * Bigger imbalances in the number of busy CPUs will be dealt with in
+ * update_sd_pick_busiest().
+ *
+ * If @sg does not have SMT siblings, only pull tasks if all of the SMT siblings
+ * of @dst_cpu are idle and @sg has lower priority.
+ */
+static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
+                                   struct sg_lb_stats *sgs,
+                                   struct sched_group *sg)
+{
+#ifdef CONFIG_SCHED_SMT
+       bool local_is_smt, sg_is_smt;
+       int sg_busy_cpus;
+
+       local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
+       sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
+
+       sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
+
+       if (!local_is_smt) {
+               /*
+                * If we are here, @dst_cpu is idle and does not have SMT
+                * siblings. Pull tasks if candidate group has two or more
+                * busy CPUs.
+                */
+               if (sg_is_smt && sg_busy_cpus >= 2)
Do you really need to test sg_is_smt? If sg_busy_cpus >= 2, then
sg_is_smt must be true, mustn't it?
Thank you very much for your feedback Vincent!

Yes, it is true that sg_busy_cpus >= 2 can only hold if @sg is SMT. I will
remove this check.
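
The implication discussed above can be sketched in a small user-space model. This is only an illustration, not kernel code: struct group_model and both helpers are hypothetical names, and the sketch assumes that a non-SMT group at this level contains exactly one CPU, which is what makes the sg_is_smt test redundant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two sg_lb_stats fields used in the patch. */
struct group_model {
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned int idle_cpus;		/* CPUs with nothing running */
};

/* Same arithmetic as sg_busy_cpus in the patch. */
static unsigned int busy_cpus(const struct group_model *g)
{
	assert(g->idle_cpus <= g->group_weight);
	return g->group_weight - g->idle_cpus;
}

/*
 * Simplified test with the sg_is_smt check dropped. Under the stated
 * assumption, a single-CPU (non-SMT) group can never have two busy CPUs,
 * so this returns true only for SMT groups.
 */
static bool has_two_or_more_busy(const struct group_model *g)
{
	return busy_cpus(g) >= 2;
}
```

A fully busy single-CPU group yields one busy CPU and fails the test, while a fully busy two-sibling group passes it.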
quoted
Also, this is the default behavior, where we want to even out the number
of busy CPUs. Shouldn't you return false and fall back to the default
behavior?
This is also true.
quoted
That being said, the default behavior tries to even out the number of idle
CPUs, which is easier to compute and is equivalent to evening out the
number of busy CPUs in a "normal" system where every group has the same
number of CPUs. That is not the case here. It could be good to change the
default behavior to even out the number of busy CPUs, and for you to use
the default behavior here. An additional condition would then be used to
select the busiest group, such as more busy CPUs or more running tasks.
That is a very good observation. Checking the number of idle CPUs
assumes that both groups have the same number of CPUs. I'll look into
modifying the default behavior.
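
A small arithmetic sketch of why the two metrics diverge when groups differ in size. The struct and helper names are hypothetical and this is not the kernel's load-balancing code; it only compares the two differences for a single-CPU local group and a two-sibling SMT candidate group.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-group statistics being compared. */
struct group_model {
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned int idle_cpus;		/* CPUs with nothing running */
};

static int busy_cpus(const struct group_model *g)
{
	assert(g->idle_cpus <= g->group_weight);
	return (int)(g->group_weight - g->idle_cpus);
}

/* Imbalance as seen when comparing idle CPUs (local minus candidate). */
static int idle_gap(const struct group_model *local,
		    const struct group_model *cand)
{
	return (int)local->idle_cpus - (int)cand->idle_cpus;
}

/* Imbalance as seen when comparing busy CPUs (candidate minus local). */
static int busy_gap(const struct group_model *local,
		    const struct group_model *cand)
{
	return busy_cpus(cand) - busy_cpus(local);
}
```

With an idle single-CPU local group (weight 1, idle 1) and an SMT candidate with one busy sibling (weight 2, idle 1), idle_gap() is 0 while busy_gap() is 1. With equal weights the two values always match, since idle = weight - busy in both groups.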
quoted
quoted
+                       return true;
+
+               /*
+                * @dst_cpu does not have SMT siblings. @sg may have SMT
+                * siblings and only one is busy. In such case, @dst_cpu
+                * can help if it has higher priority and is idle (i.e.,
+                * it has no running tasks).
The previous comment above assumes that "@dst_cpu is idle", but now you
need to check that sds->local_stat.sum_nr_running == 0.
But we already know that, right? We are here because in
update_sg_lb_stats() we determine that dst CPU is idle (env->idle !=
CPU_NOT_IDLE).
That's my point:
Why do you add the condition !sds->local_stat.sum_nr_running below? I
assume that it's to check that the CPU is idle, isn't it?
quoted
quoted
+                */
+               return !sds->local_stat.sum_nr_running &&
+                      sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
+       }
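
For reference, the !local_is_smt branch as posted can be modelled in user space roughly as follows, with the sg_is_smt test already dropped as agreed above. This is a sketch under assumptions, not the final kernel code: model_prio[] is a made-up priority table standing in for arch_asym_cpu_priority(), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 4

/* Made-up per-CPU priorities; higher value means higher priority. */
static const int model_prio[NR_MODEL_CPUS] = { 3, 3, 2, 1 };

/* Models sched_asym_prefer(): true if CPU a has higher priority than CPU b. */
static bool model_asym_prefer(int a, int b)
{
	assert(a >= 0 && a < NR_MODEL_CPUS);
	assert(b >= 0 && b < NR_MODEL_CPUS);
	return model_prio[a] > model_prio[b];
}

/*
 * Models the branch taken when @dst_cpu has no SMT siblings:
 *  - pull if the candidate group has two or more busy CPUs;
 *  - otherwise pull only if the local group runs nothing and @dst_cpu
 *    has higher priority than the candidate group's preferred CPU.
 */
static bool model_non_smt_dst_can_pull(int dst_cpu,
				       unsigned int local_nr_running,
				       unsigned int sg_busy_cpus,
				       int sg_prefer_cpu)
{
	if (sg_busy_cpus >= 2)
		return true;

	return !local_nr_running &&
	       model_asym_prefer(dst_cpu, sg_prefer_cpu);
}
```

The model makes the question in the review visible: the first return relies on @dst_cpu being idle by assumption, while the second tests it explicitly through local_nr_running.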
Thanks and BR,
Ricardo