Thread: 19 messages, 2 authors, 2024-01-19

Re: [PATCH v3 0/5] Rework system pressure interface to the scheduler

From: Dietmar Eggemann <dietmar.eggemann@arm.com>
Date: 2024-01-10 18:10:14
Also in: linux-arm-kernel, linux-arm-msm, linux-doc, linux-pm, lkml

On 09/01/2024 14:29, Vincent Guittot wrote:
> On Tue, 9 Jan 2024 at 12:34, Dietmar Eggemann [off-list ref] wrote:
>> On 08/01/2024 14:48, Vincent Guittot wrote:
>>> Following the consolidation and cleanup of CPU capacity in [1], this series
>>> reworks how the scheduler gets the pressures on CPUs. We need to take into
>>> account all pressures applied by cpufreq on the compute capacity of a CPU
>>> for dozens of ms or more and not only cpufreq cooling device or HW
>>> mitigations. We split the pressure applied on CPU's capacity in 2 parts:
>>> - one from cpufreq and freq_qos
>>> - one from HW high freq mitigation.
>>>
>>> The next step will be to add a dedicated interface for long standing
>>> capping of the CPU capacity (i.e. for seconds or more) like the
>>> scaling_max_freq of cpufreq sysfs. The latter is already taken into
>>> account by this series but as a temporary pressure which is not always the
>>> best choice when we know that it will happen for seconds or more.
>>
>> I guess this is related to the 'user space system pressure' (*) slide of
>> your OSPM '23 talk.
>
> yes
>
>> Where do you draw the line when it comes to time between (*) and the
>> 'medium pace system pressure' (e.g. thermal and FREQ_QOS)?
>
> My goal is to consider the /sys/../scaling_max_freq as the 'user space
> system pressure'
>
>> IIRC, with (*) you want to rebuild the sched domains etc.
>
> The easiest way would be to rebuild the sched_domain but the cost is
> not small so I would prefer to skip the rebuild and add a new signal
> that keeps track of this capped capacity

Are you saying that you don't need to rebuild sched domains since
cpu_capacity information of the sched domain hierarchy is
independently updated via: 

update_sd_lb_stats() {

  update_group_capacity() {

    if (!child)
      update_cpu_capacity(sd, cpu) {

        capacity = scale_rt_capacity(cpu) {

          max = get_actual_cpu_capacity(cpu) <- (*)
        }

        sdg->sgc->capacity = capacity;
        sdg->sgc->min_capacity = capacity;
        sdg->sgc->max_capacity = capacity;
      }

  }

}
        
(*) influence of temporary and permanent (to be added) frequency
pressure on cpu_capacity (per-cpu and in sd data)


example: hackbench on h960 with IPA:
                                                                                  cap  min  max
...
hackbench-2284 [007] .Ns..  2170.796726: update_group_capacity: sdg !child cpu=7 1017 1017 1017
hackbench-2456 [007] ..s..  2170.920729: update_group_capacity: sdg !child cpu=7 1018 1018 1018
    <...>-2314 [007] ..s1.  2171.044724: update_group_capacity: sdg !child cpu=7 1011 1011 1011
hackbench-2541 [007] ..s..  2171.168734: update_group_capacity: sdg !child cpu=7  918  918  918
hackbench-2558 [007] .Ns..  2171.228716: update_group_capacity: sdg !child cpu=7  912  912  912
    <...>-2321 [007] ..s..  2171.352718: update_group_capacity: sdg !child cpu=7  812  812  812
hackbench-2553 [007] ..s..  2171.476721: update_group_capacity: sdg !child cpu=7  640  640  640
    <...>-2446 [007] ..s2.  2171.600743: update_group_capacity: sdg !child cpu=7  610  610  610
hackbench-2347 [007] ..s..  2171.724738: update_group_capacity: sdg !child cpu=7  406  406  406
hackbench-2331 [007] .Ns1.  2171.848768: update_group_capacity: sdg !child cpu=7  390  390  390
hackbench-2421 [007] ..s..  2171.972733: update_group_capacity: sdg !child cpu=7  388  388  388
...