Re: [PATCH v3 3/6] cpufreq: Add an interface to mark inefficient frequencies
From: "Rafael J. Wysocki" <rafael@kernel.org>
Date: 2021-07-05 14:09:26
On Fri, Jul 2, 2021 at 9:17 PM Vincent Donnefort [off-list ref] wrote:
I'm guessing that the problem is that cpufreq_cooling works by using freq_qos_update_request() to update the max frequency limit and if that is in effect you'd rather use the inefficient frequencies, whereas when the governor selects an inefficient frequency below the policy limit, you'd rather use a higher-but-efficient frequency instead (within the policy limit). Am I guessing correctly?
Yes, correct. Thermal would use all (efficient + inefficient), but we in cpufreq governor would like to pick if possible the efficient one (below the thermal limit).
To address that, you need to pass more information from schedutil to __cpufreq_driver_target() that down the road can be used by cpufreq_frequency_table_target() to decide whether or not to skip the inefficient frequencies. For example, you can define CPUFREQ_RELATION_EFFICIENT and pass it from schedutil to __cpufreq_driver_target() in the "relation" argument, and clear it if the target frequency is above the max policy limit, or if ->target() is to be called.
What about a cpufreq_policy option that, if set, would make cpufreq_frequency_table_target() skip inefficient OPPs while staying within the limit of max policy?
That would work too, ->
Each governor could decide to set it or not, but it would hide the efficiency resolution from the governor and allow drivers that implement ->target() to also implement support for inefficient OPPs.
-> but alternatively there could be an additional cpufreq driver flag to be set by the drivers implementing ->target() and wanting to deal with CPUFREQ_RELATION_EFFICIENT themselves (an opt-in of sorts). So the governors that want it may pass CPUFREQ_RELATION_EFFICIENT to __cpufreq_driver_target() and then it will be passed to ->target() depending on whether or not the new driver flag is set.
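To make the flag handling above concrete, here is a rough userspace sketch of how the core might decide whether the efficiency hint survives on its way to the driver. All names in it (RELATION_EFFICIENT and its bit value, struct policy_sketch, driver_handles_eff, resolve_relation()) are made up for illustration and are not existing kernel symbols; it only assumes the rules discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the efficiency bit is a hypothetical encoding. */
#define RELATION_L          0u
#define RELATION_H          1u
#define RELATION_EFFICIENT  (1u << 2)

/* Hypothetical, stripped-down stand-in for the policy and driver state. */
struct policy_sketch {
	unsigned int max_khz;     /* max policy limit */
	bool has_target_cb;       /* driver implements ->target() */
	bool driver_handles_eff;  /* hypothetical opt-in driver flag */
};

/*
 * Return the relation that would be handed down after the core has
 * decided whether the efficiency hint may be kept.
 */
static unsigned int resolve_relation(const struct policy_sketch *p,
				     unsigned int target_khz,
				     unsigned int relation)
{
	assert(p != NULL);

	if (!(relation & RELATION_EFFICIENT))
		return relation;

	/* Above the max policy limit: inefficient frequencies are allowed. */
	if (target_khz > p->max_khz)
		return relation & ~RELATION_EFFICIENT;

	/* ->target() drivers only see the hint if they opted in. */
	if (p->has_target_cb && !p->driver_handles_eff)
		return relation & ~RELATION_EFFICIENT;

	return relation;
}
```

With this shape, a governor that does not care simply never sets the bit, and a ->target() driver that has not opted in never sees it.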
That flag could be set according to a new cpufreq_governor flag CPUFREQ_GOV_SKIP_INEFFICIENCIES? That could, though, modify behaviors like powersave_bias from ondemand. But if a frequency is inefficient, there's probably no power saving anyway.
AFAICS, the userspace governor aside, using inefficient frequencies only works with the powersave governor. In the other cases, RELATION_L (say) can be interpreted as "the closest efficient frequency equal to or above the target" with the max policy limit possibly causing inefficient frequencies to be used if they are closer to the limit than the next efficient one. As a rule, the governors don't assume that there are any inefficient frequencies in the table. In fact, they don't make any assumptions regarding the contents of the frequency table at all. They don't even assume that the driver uses a frequency table in the first place.
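As an illustration of that RELATION_L interpretation, a lookup over an ascending table could look roughly like the sketch below. It is plain userspace C, not the kernel's cpufreq_frequency_table_target(); struct freq_entry, pick_freq_l() and the fallback policy when no efficient entry fits under the limit are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table entry; not the kernel's struct cpufreq_frequency_table. */
struct freq_entry {
	unsigned int khz;
	bool inefficient;
};

/*
 * Pick the lowest entry at or above target_khz from a table sorted in
 * ascending order, never exceeding max_khz. With skip_inefficient set,
 * inefficient entries are passed over as long as an efficient one exists
 * within the limit; otherwise the closest inefficient entry is used.
 * Returns the table index, or -1 if nothing is within the limit.
 */
static int pick_freq_l(const struct freq_entry *tbl, size_t n,
		       unsigned int target_khz, unsigned int max_khz,
		       bool skip_inefficient)
{
	int first_fit = -1;  /* first entry >= target, efficient or not */
	int highest = -1;    /* highest entry within the limit */

	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].khz > max_khz)
			break;
		highest = (int)i;
		if (tbl[i].khz < target_khz)
			continue;
		if (first_fit < 0)
			first_fit = (int)i;
		if (!skip_inefficient || !tbl[i].inefficient)
			return (int)i;
	}

	/* No efficient fit under the limit: fall back to what is allowed. */
	return first_fit >= 0 ? first_fit : highest;
}
```

For a table of 500 (efficient), 800 (inefficient) and 1000 (efficient) MHz-equivalent entries, a target of 700 resolves to 1000 when the limit permits it, and to 800 once the max policy limit drops below 1000.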