Thread (17 messages) 17 messages, 4 authors, 2019-07-04

Re: [RFCv2 0/8] Add imx8mm bus frequency switching

From: Saravana Kannan <hidden>
Date: 2019-07-04 03:03:03
Also in: linux-clk, linux-pm

On Wed, Jul 3, 2019 at 4:30 PM Leonard Crestez [off-list ref] wrote:
On 7/4/2019 1:20 AM, Saravana Kannan wrote:
quoted
quoted
The interconnect and devfreq parts do not communicate explicitly: they both
just call clk_set_min_rate and the clk core picks the minimum value that can
satisfy both. They are thus completely independent.
Two different parts not talking to each other and just setting min
rate can cause problems for some concurrency use cases. ICC framework
is explicitly designed to handle cases like this and aggregate the
needs correctly. You might want to look into that more closely.
The clk framework aggregates the min_rate requests from multiple
consumers under a big "prepare_lock" so I expect it will deal with
concurrent requests correctly. As for performance: frequency switching
shouldn't be a fast path.
Sorry I wasn't clear. I was not talking about locking issues or race
conditions when I said concurrency use cases. What I meant was if GPU
wants 5 GB/s and at the same time (concurrent) camera wants 5 GB/s
you'll need to configure the interconnect for 10 GB/s. If both of them
just try to set the min freq equivalent for 5 GB/s the performance
would be bad or functionality might break.
quoted
quoted
The clk_set_min_rate approach does not mesh very well with the OPP framework.
Some of interconnect nodes on imx8m can run at different voltages: OPP can
handle this well but not in response to a clk_set_min_rate from an unrelated
subsystem. Maybe set voltage on a clk notifier?
I think if you design it something like below, it might make your life
a whole lot easier.
Hopefully the formatting looks okay on your end. The arrow going up is
just connecting devfreq to ICC.

                         Proactive -> ICC -> clk/OPP API to set freq/volt
                                       ^
                                       |
HW measure -> governor -> devfreq ----+

That way, all voltage/frequency requirements are managed cleanly by
clk/OPP frameworks. The ICC deals with aggregating all the
requirements and devfreq lets you handle reactive scaling and policy.
If icc and devfreq are to directly communicate it makes more sense to do
it backwards: ICC should set a min_freq on nodes which have a devfreq
instance attached and devfreq should implement actual freq switching.

HW measurement is done on individual nodes while ICC deals with requests
along a path. In particular on imx we have a performance monitor
attached to the ddr controller and I doubt it can distinguish between
masters so how could this be mapped usefully to an interconnect request?
Ah, that was the missing piece for me -- you are trying to use a
central performance monitor. I see what you are trying to do.

So you are looking at system wide traffic at DDR and then using that
to scale the interconnect/DDRC. I don't know how complicated or not
the IMX interconnect topology is, so please pardon my questions. If
you are using a performance monitor at the DDR controller, why do you
need the "proactive" requests from other clients? Wouldn't the
performance monitor account for all the traffic to DDR?
As far as I understand with devfreq the ddrc node could use "ondemand"
while the other nodes would default to the "passive" governor and run at
predefined default ratios relative to DDRC.
Yes, that approach would also work but I'm not sure why you need the
ICC framework in that case.
quoted
If all of this makes sense, please take a look at [2] and provide your
thoughts. I've dropped a few patches from [1] to avoid confusion (too
much going on at once). I think BW OPP tables and having OPP tables
for interconnect paths will be something you'll need (if not now,
eventually) and something you can build on top of nicely.
I found it very confusing that you're assigning BW OPP tables to
devices. My initial understanding was that BW OPP would map "bandwidth"
to "frequency" so BW OPPs should be assigned to links along the
interconnect graph. But maybe what you want is to have OPPs for the BW
values requested by devices?
I want to have OPPs for bandwidths requested for paths. Each
interconnect node can also use BW OPPs if that makes sense for them,
but I think they'd be better served by using frequency OPPs.
Looking at the sdm845 icc provider source and it seems that those
"peak/avg" values are actually parameters which go into a firmware
command!? It makes sense that you want interconnect to be "below"
devfreq since icc_provider.set maps very closely to what firmware exposes.
Even without the firmware (it's mainly there to aggregate requests for
some system wide resources) or if interconnects are scaled directly
using clock APIs (older chips), sdm845 would still want ICC to be
below devfreq. It's because 845 doesn't try to do ICC scaling by
measuring at the DDR. Each master makes separate requests and then the
ICC aggregates and sets the frequency. They have their reasons (good
ones) for doing that.
 > Interconnects and interconnect paths quantify their performance
levels > in terms of bandwidth and not in terms of frequency.

On i.MX we just have a bunch of interconnect IPs for which frequencies
can be adjusted (in hz)
This is true for every chip. In the end, they all set the Hz of hardware IPs.
so the above statement doesn't really hold. It
is up to an icc provider to convert aggregate bandwidth values to
frequencies along the path.
That's the job of every ICC provider. In the case of i.MX you use
clock APIs and in the case of sdm845 they use some complicated
firmware interface. But in both cases, when the system is tuned for
performance/power everyone in the end cares about frequency and
voltage. If the frequency goes too high, they might reduce the
bandwidth to make sure the frequency remains reasonable for whatever
use case they are profiling.

I think the main difference is where the performance monitoring or
performance decision is done. If you don't have a central performance
monitor or don't want to use one, then the policy decision (which is
where devfreq fits in) will have to happen at the bandwidth decision
level.

-Saravana

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help