Re: [PATCH v3 3/6] cpufreq: Add an interface to mark inefficient frequencies
From: Lukasz Luba <lukasz.luba@arm.com>
Date: 2021-07-02 19:04:42
On 7/2/21 6:53 PM, Rafael J. Wysocki wrote:
On Fri, Jul 2, 2021 at 6:08 PM Lukasz Luba [off-list ref] wrote:quoted
On 7/2/21 5:04 PM, Rafael J. Wysocki wrote:quoted
On Fri, Jul 2, 2021 at 5:46 PM Rafael J. Wysocki [off-list ref] wrote:quoted
On Fri, Jul 2, 2021 at 4:21 PM Lukasz Luba [off-list ref] wrote:quoted
Hi Rafael, This is a gentle ping. You have probably not seen this discussion thread.I have looked at it briefly for a few times, but not too much into detail.quoted
On 6/16/21 1:45 PM, Lukasz Luba wrote:quoted
On 6/16/21 11:53 AM, Viresh Kumar wrote:quoted
On 16-06-21, 11:33, Lukasz Luba wrote:quoted
On 6/16/21 10:31 AM, Viresh Kumar wrote:quoted
On 16-06-21, 10:03, Lukasz Luba wrote: Clean is not lesser number of lines for me, but rather having the right ownership of such things.Some developers do like patches which removes more lines then adds ;):)[snip]quoted
quoted
What could be the modified v1 [2]: - LUT which holds two IDs: efficient, inefficient, take one according to the clamp f_max - add new argument 'policy->max' to em_pd_get_efficient_freq() freq = em_pd_get_efficient_freq(em_cpu_get(policy->cpu), freq, policy->max); The problem was that EAS couldn't know the clamp freq_max, which shouldn't be the blocker now.If you can do that without adding any EM specific stuff in the cpufreq core, I will mostly be fine. But honestly speaking, creating more data structures to keep related information doesn't scale well. We already have so many tables for keeping freq/voltage pairs, OPP, cpufreq, EM. You tried to add one more in EM I think V1, not sure. It is always better to consolidate and we almost reached to a point where that could have been done very easily. I understand that you didn't want to touch so many different parts, but anyway..Yes, I don't want to touch some many generic parts because they are intended to be generic. This heuristic is only for EM platforms, which are arm, arm64 battery powered (not servers or other). Thus, I wanted to keep it locally. The cost of EM extra structures (the LUT) will only be for platforms for which EM discovers that they have inefficient performance levels. The code would even not be compiled in for x86, ppc, etc, in hot path.quoted
quoted
quoted
quoted
this v3 and your proposal.IMHO, adding such callbacks to the EM core, like .mark_efficient(), will only make this easier to handle for all different frameworks, and not otherwise. The code will look much cleaner everywhere..What about coming back to the slightly modified v1 idea? That was really self-contained modification for this inefficient opps heuristic.I am not sure if I really understand what that would be, but again adding another table is going to create more problems then it should.IMHO your proposals are more invasive for the generic code, while not always being even used. Plenty of architectures don't even set EM, even arm64 for servers doesn't use it. You and Rafael would have to maintain these modifications in generic code. It might be hard to remove it. While I recommend to keep this heuristic feature inside the EM and we will maintain it. If we decide after a few years that new arm64 platforms use some smarter FW for performance level, we might just disable this heuristic.quoted
Anyway, that's my view, which can be wrong as well. Rafael: You have any suggestions here ?I would also like to hear Rafael's opinion :)It would be great if you could have a look. I will be grateful for your time spend on it and opinion.First of all, IMO checking whether or not a given frequency is "efficient" doesn't belong to cpufreq governors. The governor knows what the min and max supported frequencies are and selects one from that range and note that it doesn't even check whether or not the selected frequency is present in the frequency table. That part belongs to the driver or the general frequency table handling in the cpufreq core. So the governor doesn't care and it shouldn't care IMO. Besides, in the cpufreq_driver_fast_switch() case the driver's ->fast_switch() callback is entirely responsible for applying the selected frequency, so I'm not sure how this "efficient" thing is going to work then? Anyway, since we are talking about frequency tables, that would be the __cpufreq_driver_target() case only and specifically in the __target_index() case only (note how far away this is from the governor). Now, you may think about modifying cpufreq_frequency_table_target() to skip "inefficient" frequencies, but then the question is why those frequencies need to be there in the frequency table in the first place?I'm guessing that the problem is that cpufreq_cooling works by using freq_qos_update_request() to update the max frequency limit and if that is in effect you'd rather use the inefficient frequencies, whereas when the governor selects an inefficient frequency below the policy limit, you'd rather use a higher-but-efficient frequency instead (within the policy limit). Am I guessing correctly?Yes, correct. Thermal would use all (efficient + inefficient), but we in cpufreq governor would like to pick if possible the efficient one (below the thermal limit).To address that, you need to pass more information from schedutil to __cpufreq_driver_target() that down the road can be used by cpufreq_frequency_table_target() to decide whether or not to skip the inefficient frequencies. For example, you can define CPUFREQ_RELATION_EFFICIENT and pass it from schedutil to __cpufreq_driver_target() in the "relation" argument, and clear it if the target frequency is above the max policy limit, or if ->target() is to be called.
Thank you Rafael for valuable comments. We will have to experiment with that option and come back with implementation based on it. Regards, Lukasz