Re: [PATCH v3 3/6] cpufreq: Add an interface to mark inefficient frequencies

From: Lukasz Luba <lukasz.luba@arm.com>
Date: 2021-07-02 19:04:42


On 7/2/21 6:53 PM, Rafael J. Wysocki wrote:

On Fri, Jul 2, 2021 at 6:08 PM Lukasz Luba [off-list ref] wrote:

quoted



On 7/2/21 5:04 PM, Rafael J. Wysocki wrote:

quoted

On Fri, Jul 2, 2021 at 5:46 PM Rafael J. Wysocki [off-list ref] wrote:

quoted

On Fri, Jul 2, 2021 at 4:21 PM Lukasz Luba [off-list ref] wrote:

quoted

Hi Rafael,

This is a gentle ping.
You have probably not seen this discussion thread.

I have looked at it briefly for a few times, but not too much into detail.

quoted

On 6/16/21 1:45 PM, Lukasz Luba wrote:

quoted


On 6/16/21 11:53 AM, Viresh Kumar wrote:

quoted

On 16-06-21, 11:33, Lukasz Luba wrote:

quoted

On 6/16/21 10:31 AM, Viresh Kumar wrote:

quoted

On 16-06-21, 10:03, Lukasz Luba wrote:
Clean is not lesser number of lines for me, but rather having the
right ownership of such things.

Some developers do like patches which removes more lines then adds ;)

:)

[snip]

quoted

What could be the modified v1 [2]:
- LUT which holds two IDs: efficient, inefficient, take one
     according to the clamp f_max
- add new argument 'policy->max' to em_pd_get_efficient_freq()

freq = em_pd_get_efficient_freq(em_cpu_get(policy->cpu), freq,
policy->max);

The problem was that EAS couldn't know the clamp freq_max,
which shouldn't be the blocker now.

If you can do that without adding any EM specific stuff in the cpufreq
core, I will mostly be fine.

But honestly speaking, creating more data structures to keep related
information doesn't scale well.

We already have so many tables for keeping freq/voltage pairs, OPP,
cpufreq, EM. You tried to add one more in EM I think V1, not sure.

It is always better to consolidate and we almost reached to a point
where that could have been done very easily. I understand that you
didn't want to touch so many different parts, but anyway..

Yes, I don't want to touch some many generic parts because they
are intended to be generic. This heuristic is only for EM platforms,
which are arm, arm64 battery powered (not servers or other).
Thus, I wanted to keep it locally. The cost of EM extra structures
(the LUT) will only be for platforms for which EM discovers that
they have inefficient performance levels.
The code would even not be compiled in for x86, ppc, etc, in hot path.

quoted

this v3 and your proposal.

IMHO, adding such callbacks to the EM core, like .mark_efficient(),
will only make this easier to handle for all different frameworks, and
not otherwise. The code will look much cleaner everywhere..

What about coming back to the slightly modified v1 idea?
That was really self-contained modification for this
inefficient opps heuristic.

I am not sure if I really understand what that would be, but again
adding another table is going to create more problems then it should.

IMHO your proposals are more invasive for the generic code, while
not always being even used. Plenty of architectures don't even set EM,
even arm64 for servers doesn't use it. You and Rafael would have to
maintain these modifications in generic code. It might be hard to remove
it. While I recommend to keep this heuristic feature inside the EM and
we will maintain it. If we decide after a few years that new arm64
platforms use some smarter FW for performance level, we might just
disable this heuristic.

quoted

Anyway, that's my view, which can be wrong as well.

Rafael: You have any suggestions here ?

I would also like to hear Rafael's opinion :)

It would be great if you could have a look.
I will be grateful for your time spend on it and opinion.

First of all, IMO checking whether or not a given frequency is
"efficient" doesn't belong to cpufreq governors.  The governor knows
what the min and max supported frequencies are and selects one from
that range and note that it doesn't even check whether or not the
selected frequency is present in the frequency table.  That part
belongs to the driver or the general frequency table handling in the
cpufreq core.

So the governor doesn't care and it shouldn't care IMO.

Besides, in the cpufreq_driver_fast_switch() case the driver's
->fast_switch() callback is entirely responsible for applying the
selected frequency, so I'm not sure how this "efficient" thing is
going to work then?

Anyway, since we are talking about frequency tables, that would be the
__cpufreq_driver_target() case only and specifically in the
__target_index() case only (note how far away this is from the
governor).

Now, you may think about modifying cpufreq_frequency_table_target() to
skip "inefficient" frequencies, but then the question is why those
frequencies need to be there in the frequency table in the first
place?

I'm guessing that the problem is that cpufreq_cooling works by using
freq_qos_update_request() to update the max frequency limit and if
that is in effect you'd rather use the inefficient frequencies,
whereas when the governor selects an inefficient frequency  below the
policy limit, you'd rather use a higher-but-efficient frequency
instead (within the policy limit).

Am I guessing correctly?

Yes, correct. Thermal would use all (efficient + inefficient), but
we in cpufreq governor would like to pick if possible the efficient
one (below the thermal limit).

To address that, you need to pass more information from schedutil to
__cpufreq_driver_target() that down the road can be used by
cpufreq_frequency_table_target() to decide whether or not to skip the
inefficient frequencies.

For example, you can define CPUFREQ_RELATION_EFFICIENT and pass it
from schedutil to __cpufreq_driver_target() in the "relation"
argument, and clear it if the target frequency is above the max policy
limit, or if ->target() is to be called.

Thank you Rafael for valuable comments. We will have to experiment
with that option and come back with implementation based on it.

Regards,
Lukasz

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help