Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy
From: Lukasz Luba <lukasz.luba@arm.com>
Date: 2021-07-07 09:54:24
Also in:
lkml
On 7/7/21 10:45 AM, Dietmar Eggemann wrote:
On 07/07/2021 10:23, Lukasz Luba wrote:
On 7/7/21 9:00 AM, Vincent Guittot wrote:
On Wed, 7 Jul 2021 at 09:49, Lukasz Luba [off-list ref] wrote:
On 7/7/21 8:07 AM, Vincent Guittot wrote:
On Fri, 25 Jun 2021 at 17:26, Lukasz Luba [off-list ref] wrote:
[...]
>>>>> Could you explain why 32bits results are not enough and you need to
>>>>> move to 64bits ?
>>>>>
>>>>> Right now the result is in the range [0..2^32[ mW. If you need more
>>>>> precision and you want to return uW instead, you will have a result in
>>>>> the range [0..4kW[ which seems to be still enough
>>>>
>>>> Currently we have the max value limit for 'power' in EM which is
>>>> EM_MAX_POWER 0xffff (64k - 1). We allow to register such big power
>>>> values ~64k mW (~64Watts) for an OPP. Then based on 'power' we
>>>> pre-calculate 'cost' fields:
>>>>
>>>>     cost[i] = power[i] * freq_max / freq[i]
>>>>
>>>> So, for max freq the cost == power. Let's use that in the example.
>>>>
>>>> Then the em_cpu_energy() calculates as follow:
>>>>
>>>>     cost * sum_util / scale_cpu
>>>>
>>>> We are interested in the first part - the value of multiplication.
>>>
>>> But all these are internal computations of the energy model. At the
>>> end, the computed energy that is returned by compute_energy() and
>>> em_cpu_energy(), fits in a long
>>
>> Let's take a look at existing *10000 precision for x CPUs:
>>
>>     cost * sum_util / scale_cpu = (64k *10000) * (x * 800) / 1024
>>
>> which is: x * ~500mln
>>
>> So to be close to overflowing u32 the 'x' has to be > (?=) 8
>> (depends on sum_util).
>
> I assume the worst case is `x * 1024` (max return value of
> effective_cpu_util = effective_cpu_util()) so x ~ 6.7.
>
> I'm not aware of any arm32 b.L. systems with > 4 CPUs in a PD.
True, arm32 didn't support more than 4 CPUs in a cluster. We would be
safe for them, but I don't want this assumption to break any other
32bit platform from competitors, which might create such 32bit 16-core
clusters.

If Peter, Vincent and you are OK with relying on this assumption about
the maximum safe number of CPUs, then we can get rid of patch 1/3. But
the temporary u64 division must stay, because there is an arm32 platform
which needs it. So also returning u64 does no big harm and looks more
consistent.
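
For reference, the overflow arithmetic discussed above can be checked with a small userspace sketch. This is not kernel code: the helper names energy_estimate() and max_cpus_fitting_u32() are made up for illustration, and it assumes cost == power == EM_MAX_POWER (the max-frequency OPP), the existing *10000 precision, scale_cpu == 1024 and the same utilization on every CPU of the PD. The multiplication is done in 64 bits, as in the temporary u64 math mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define EM_MAX_POWER 0xffffULL  /* max 'power' accepted by the EM, in mW */
#define PRECISION    10000ULL   /* the "*10000" precision factor from the thread */
#define SCALE_CPU    1024ULL    /* assumed CPU capacity scale */

/* cost * sum_util / scale_cpu, with the multiplication done in 64 bits. */
static uint64_t energy_estimate(uint64_t cost, uint64_t nr_cpus,
                                uint64_t util_per_cpu)
{
    uint64_t sum_util = nr_cpus * util_per_cpu;

    return cost * sum_util / SCALE_CPU;
}

/*
 * Largest number of CPUs in one performance domain for which the
 * estimate still fits in a u32, given the per-CPU utilization.
 */
static unsigned int max_cpus_fitting_u32(uint64_t util_per_cpu)
{
    uint64_t cost = EM_MAX_POWER * PRECISION;  /* cost == power at max freq */
    unsigned int x = 0;

    assert(util_per_cpu > 0);  /* with zero utilization the loop never ends */

    while (energy_estimate(cost, x + 1, util_per_cpu) <= UINT32_MAX)
        x++;

    return x;
}
```

Under these assumptions max_cpus_fitting_u32(800) gives 8 and max_cpus_fitting_u32(1024) gives 6, which lines up with the "> 8" and "~6.7" estimates in the quoted discussion (the latter rounds 0xffff to 64000).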