Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy
From: Vincent Guittot <vincent.guittot@linaro.org>
Date: 2021-07-07 09:56:39
Also in: lkml
On Wed, 7 Jul 2021 at 11:48, Lukasz Luba [off-list ref] wrote:
> On 7/7/21 10:37 AM, Vincent Guittot wrote:
>> On Wed, 7 Jul 2021 at 10:23, Lukasz Luba [off-list ref] wrote:
>>> On 7/7/21 9:00 AM, Vincent Guittot wrote:
>>>> On Wed, 7 Jul 2021 at 09:49, Lukasz Luba [off-list ref] wrote:
>>>>> On 7/7/21 8:07 AM, Vincent Guittot wrote:
>>>>>> On Fri, 25 Jun 2021 at 17:26, Lukasz Luba [off-list ref] wrote:
>>>>>>> The Energy Aware Scheduler (EAS) tries to find the best CPU for a waking up task. It probes many possibilities and compares the estimated energy values for different scenarios. For calculating those energy values it relies on Energy Model (EM) data and em_cpu_energy(). The precision which is used in EM data is in milli-Watts (or abstract scale), which sometimes is not sufficient. In some cases it might happen that two CPUs from different Performance Domains (PDs) get the same calculated value for a given task placement, but in a more precise scale, they might differ. This rounding error has to be addressed. This patch prepares EAS code for better precision in the coming EM improvements.
>>>>>>
>>>>>> Could you explain why 32-bit results are not enough and you need to move to 64 bits? Right now the result is in the range [0..2^32[ mW. If you need more precision and you want to return uW instead, you will have a result in the range [0..4kW[ which seems to be still enough.
>>>>>
>>>>> Currently we have the max value limit for 'power' in EM which is EM_MAX_POWER 0xffff (64k - 1). We allow registering such big power values ~64k mW (~64 Watts) for an OPP. Then based on 'power' we pre-calculate 'cost' fields:
>>>>> cost[i] = power[i] * freq_max / freq[i]
>>>>> So, for max freq the cost == power. Let's use that in the example.
>>>>>
>>>>> Then em_cpu_energy() calculates as follows:
>>>>> cost * sum_util / scale_cpu
>>>>> We are interested in the first part - the value of the multiplication.
>>>>
>>>> But all these are internal computations of the energy model. At the end, the computed energy that is returned by compute_energy() and em_cpu_energy() fits in a long.
>>>
>>> Let's take a look at the existing *10000 precision for x CPUs:
>>> cost * sum_util / scale_cpu =
>>> (64k * 10000) * (x * 800) / 1024
>>> which is:
>>> x * ~500mln
>>>
>>> So to be close to overflowing u32 the 'x' has to be > (?=) 8 (depends on sum_util).
>>
>> Sorry but I don't get your point.
>> This patch is about the return type of compute_energy() and em_cpu_energy(). And even if we decide to return uW instead of mW, there is still a lot of margin.
>>
>> It's not because you need u64 for computing an intermediate value that you must return u64.
>
> The example above shows the need for a u64 return value for platforms which:
> - are 32bit
> - have e.g. 16 CPUs
> - have a big power value e.g. ~64k mW
> Then let's do the calc:
> (64k * 10000) * (16 * 800) / 1024 = ~8000mln = ~8bln

so you return a power consumption of 8kW !!!

> The returned value after applying the whole patch set won't fit in u32 for such a cluster. We might make an *assumption* that the 32bit platforms will not have a bigger number of CPUs in the cluster or won't report big power values. But I didn't want to make such an assumption.