cpufreq: frequency scaling spec in DT node
From: Mason <hidden>
Date: 2017-06-29 11:41:46
Also in: linux-pm
On 29/06/2017 12:04, Viresh Kumar wrote:
> On 29-06-17, 11:48, Mason wrote:
> > I have two similar, but slightly different SoCs.
> > Firmware/bootloader sets the "nominal" CPU frequency to
>
> So nominal here is MAX cpu frequency.
>
> > - 1215 MHz on SoC A
> > - 1206 MHz on SoC B
> >
> > On both systems, software can reduce the CPU frequency by writing
> > an 8-bit integer divider to an MMIO register.
> >
> > Originally, I wanted to define a small number of operating points,
> > defined only by the divider value, and compute the actual OPP freq
> > at init.
> >
> > For example, use { 1, 2, 3, 5, 9 } for dividers
> > => 1215, 607.5, 405, 243, 135 on SoC A
> >    1206, 603, 402, 241.2, 134 on SoC B
> >
> > I'm using the generic cpufreq driver.
> > Binding for the generic cpufreq driver:
> > https://www.kernel.org/doc/Documentation/devicetree/bindings/cpufreq/cpufreq-dt.txt
> >
> > I don't think there's a way to do what I want with the existing
> > driver, right?
>
> No, you should rather use actual target frequency values.
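As a rough illustration of the divider arithmetic described above, the following Python sketch derives an OPP table in kHz (the unit used in the DT operating-points property) from a nominal frequency and the divider set { 1, 2, 3, 5, 9 }. The function name opp_table_khz is made up for this example and is not part of any kernel interface.

```python
def opp_table_khz(nominal_khz, dividers=(1, 2, 3, 5, 9)):
    """Return the frequencies (kHz) obtained by dividing nominal_khz.

    Hypothetical helper for illustration only. Integer division is used,
    so a non-integer result would be truncated to whole kHz.
    """
    return [nominal_khz // d for d in dividers]


if __name__ == "__main__":
    print("SoC A:", opp_table_khz(1215000))
    print("SoC B:", opp_table_khz(1206000))
```

For both nominal values in the thread, every division is exact in kHz, so the resulting tables match the figures quoted above.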
>
> > It's not a big deal, I can write the actual target frequencies in
> > the DT.
>
> Right.
>
> > (BTW, the OPPs are more SW than HW desc, right?)
>
> Hmm, I wouldn't say that exactly :)
>
> What OPP contains is mostly defined by hardware, apart from the
> frequency values we are talking about. And those are decided by the
> boot loaders and they are like hardware to the kernel really. They
> define hardware capabilities IOW.
>
> If you want, you can actually try implementing a ->target() type
> cpufreq driver instead of ->target_index() and you will be able to
> select any frequency you want. But with the above example, what you
> can select is Max divided by integer value and so you can have 9
> different OPPs and reuse cpufreq-dt.
>
> > But my problem is: what happens if firmware/bootloader is changed
> > without me knowing, and they change the nominal frequency?
>
> The kernel doesn't have any authority over what frequencies we are
> allowed to use and we depend on the boot loader for that. If someone
> changes that, screw him :)
>
> > Because of the rounding, if the nominal freq is slightly increased,
> > the SoC will start working at
>
> decreased ?
>
> > *slower* speeds. For example, if nominal is 1215, and I request 603,
> > I will actually get 405.
>
> No, you will normally get a frequency >= requested frequency with
> the cpufreq governors we have.
>
> > This effect can be seen if I define SoC B OPPs on SoC A:
> >
> > $ cat scaling_available_frequencies
> > 134000 241200 402000 603000 1206000
> > /sys/devices/system/cpu/cpu0/cpufreq$ echo 603000 > scaling_max_freq
>
> Wow. This is not how you request a frequency. What you said here is
> that the MAX frequency allowed now is 603000 instead of 1206000. And
> because 603000 isn't a valid frequency, we go down to 405000.
>
> So, you should try using the userspace governor and play with
> scaling_setspeed sysfs file.

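The suggestion above (select the userspace governor, then write to scaling_setspeed) could be scripted roughly as follows. This is only a sketch: request_frequency is a hypothetical helper name, it assumes the userspace governor is built into the running kernel, and writing to the real sysfs files requires root.

```python
import os


def request_frequency(policy_dir, freq_khz):
    """Select the userspace governor, then request freq_khz.

    policy_dir is the cpufreq directory of one CPU, for example
    /sys/devices/system/cpu/cpu0/cpufreq (assumed path, taken from the
    shell prompt shown in the thread).
    """
    # Switch governor first: scaling_setspeed is only meaningful
    # while the userspace governor is active.
    with open(os.path.join(policy_dir, "scaling_governor"), "w") as f:
        f.write("userspace")
    # Frequencies in cpufreq sysfs files are expressed in kHz.
    with open(os.path.join(policy_dir, "scaling_setspeed"), "w") as f:
        f.write(str(freq_khz))
```

A call such as request_frequency("/sys/devices/system/cpu/cpu0/cpufreq", 603000) asks for one specific frequency, whereas writing to scaling_max_freq only lowers the upper bound of the policy.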
I was trying to "emulate" the behavior of the ondemand governor.
Based on your reaction, I got it wrong...

Here is the actual issue:

I'm on SoC B, where nominal/max freq is expected to be 1206 MHz.
So the OPPs in the DT are:

operating-points = <1206000 0 603000 0 402000 0 241200 0 134000 0>;

*But* FW changed the max freq behind my back, to 1215 MHz.

Here is what happens when I execute:

echo ondemand >scaling_governor
sleep 2
cpuburn-a9 & cpuburn-a9 & cpuburn-a9 & cpuburn-a9
### cpuburn-a9 spins in a tight infinite loop,
### hitting all FUs to raise the CPU temperature

# cpufreq_test.sh
[ 69.933874] set_target: index=4
[ 69.944799] set_target: index=2
[ 69.947988] clk_divider_set_rate: rate=303750000 parent_rate=1215000000 div=4
[ 69.955542] set_target: index=4
[ 69.958801] clk_divider_set_rate: rate=607500000 parent_rate=1215000000 div=2
[ 69.984789] set_target: index=0
[ 69.987980] clk_divider_set_rate: rate=121500000 parent_rate=1215000000 div=10
[ 71.947597] set_target: index=4
[ 71.950996] clk_divider_set_rate: rate=607500000 parent_rate=1215000000 div=2

As you can see, the divider remains stuck at 2, so the SoC is
actually running only at 607.5 MHz (instead of 1215 MHz).

If I fix the OPPs in DT to:

operating-points = <1215000 0 607500 0 405000 0 243000 0 135000 0>;

Then I get the expected behavior:

$ cpufreq_test.sh
[ 32.717930] set_target: index=1
[ 32.721131] clk_divider_set_rate: rate=243000000 parent_rate=1215000000 div=5
[ 32.731326] set_target: index=4
[ 32.734521] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 32.754556] set_target: index=0
[ 32.757738] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 32.765864] set_target: index=4
[ 32.769217] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 33.438811] set_target: index=0
[ 33.442001] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 33.450249] set_target: index=4
[ 33.453470] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 33.477888] set_target: index=0
[ 33.481067] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 34.714786] set_target: index=4
[ 34.718237] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1

Divider settles at 1 (full speed) to provide maximum performance
for the user-space processes.

My concern is that if I don't check somewhere that the nominal
frequency is as expected in the DT, the CPU might run slower than
expected (max freq cut in half).

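The logged values are consistent with a divider that is rounded up, so that the resulting rate never exceeds the requested one. That is an inference from the logs, not something stated in the thread. Under that assumption, and assuming the 8-bit divider tops out at 255, a small Python model reproduces the rate/div pairs seen above. The function names are hypothetical.

```python
def divider_for(parent_hz, requested_hz, max_div=255):
    """Smallest integer divider whose output does not exceed requested_hz.

    Assumptions: the divider is rounded up (ceiling division) and is
    limited to max_div, a guess based on the 8-bit register.
    """
    div = -(-parent_hz // requested_hz)  # ceiling division on integers
    return max(1, min(div, max_div))


def achieved_rate(parent_hz, requested_hz):
    """Rate actually produced for a request, under the same assumptions."""
    return parent_hz // divider_for(parent_hz, requested_hz)


if __name__ == "__main__":
    parent = 1215000000  # nominal rate actually set by the firmware
    for khz in (1206000, 603000, 402000, 241200, 134000):  # SoC B OPPs
        div = divider_for(parent, khz * 1000)
        print(khz, "kHz requested -> div", div, "->", parent // div, "Hz")
```

With the SoC B table on a 1215 MHz parent, a request for 1206000 kHz needs a divider slightly above 1, which rounds up to 2. That is why the highest OPP ends up at 607.5 MHz.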
> > [ 60.401883] set_target: index=3
> > [ 60.405118] clk_divider_set_rate: rate=405000000 parent_rate=1215000000 div=3
> >
> > What can I do against that? Should I check the nominal frequency in
> > my clk driver? (I'm not sure reading properties of unrelated nodes
> > is acceptable practice.)
>
> We rely on the boot loader to get these details.
>
> There is one thing you can do to avoid adding OPP entries in the DT.
> You can rather add them dynamically with help of:
>
> dev_pm_opp_add()
>
> and cpufreq-dt will continue to work with that too.

In what driver should I call these... the clk driver?
(drivers/clk/tegra/cvb.c seems to be doing that)

A problem might arise when I need to do voltage scaling, though,
since I also need to specify voltages, right?

> But you should understand how to use the sysfs interface first and
> make sure you are doing the right thing.

You're talking about this document, right?
https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt

Regards.