Thread (10 messages), 2 authors, 2021-01-30

RE: [PATCH] thermal/intel: introduce tcc cooling driver

From: "Zhang, Rui" <rui.zhang@intel.com>
Date: 2021-01-18 10:47:09

Hi, Doug,

Thanks for testing this patch.

-----Original Message-----
From: Doug Smythies <redacted>
Sent: Sunday, January 17, 2021 1:08 AM
To: Zhang, Rui <rui.zhang@intel.com>
Cc: daniel.lezcano@linaro.org; srinivas.pandruvada@linux.intel.com; linux-pm@vger.kernel.org
Subject: RE: [PATCH] thermal/intel: introduce tcc cooling driver
Importance: High

On 2021.01.15 Zhang Rui wrote:
On Intel processors, the core frequency can be reduced below OS
request, when the current temperature reaches the TCC (Thermal Control
Circuit) activation temperature.

The default TCC activation temperature is specified by
MSR_IA32_TEMPERATURE_TARGET. However, it can be adjusted by specifying
an offset in degrees C, using the TCC Offset bits in the same MSR register.

This patch introduces a cooling devices driver that utilizes the TCC
Offset feature. The bigger the current cooling state is, the lower the
effective TCC activation temperature is, so that the processors can be
throttled earlier before system critical overheats.
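
As a rough illustration of the relationship described above, the following Python sketch decodes a raw MSR_IA32_TEMPERATURE_TARGET value and derives the effective TCC activation temperature. The bit positions (temperature target in bits 23:16, TCC offset in bits 29:24) and the example value are assumptions made for illustration; the offset field width differs between processor models and should be checked against the SDM for the part in question.

```python
# Assumed field layout of MSR_IA32_TEMPERATURE_TARGET (0x1A2); verify per CPU model.
TEMP_TARGET_SHIFT = 16
TEMP_TARGET_MASK = 0xFF
TCC_OFFSET_SHIFT = 24
TCC_OFFSET_MASK = 0x3F  # assumption: 6-bit offset field


def decode_temperature_target(msr_value):
    """Return (default_activation_c, tcc_offset_c, effective_activation_c)."""
    default_c = (msr_value >> TEMP_TARGET_SHIFT) & TEMP_TARGET_MASK
    offset_c = (msr_value >> TCC_OFFSET_SHIFT) & TCC_OFFSET_MASK
    # A larger offset lowers the temperature at which TCC activates.
    return default_c, offset_c, default_c - offset_c


# Hypothetical raw value: default activation at 100 C with a 45 C offset.
example = (45 << TCC_OFFSET_SHIFT) | (100 << TEMP_TARGET_SHIFT)
print(decode_temperature_target(example))
```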

Thank you for this useful patch.
My systems don't need thermald or any other thermal control, but it is nice
to have this extra margin to add to the critical stuff, as a backup.
I also like to use the offset to test stuff.

I use the internal power limit servo for power limiting, and that servo works
very well indeed. Using this temperature offset as a way to servo the
thermal operating limit does work, but tends to overshoot, oscillate, hold low
excessively long (minutes).

Do you have a script to test and show the drawbacks of this feature?
It seems that it behaves differently on different platforms.
Maybe we can evaluate this on more platforms.

It also seems to limit CPU clock frequency
reduction to the non-turbo limit, regardless of the desired maximum
temperature.

I am not familiar with the thermal stuff at all, and didn't know where to find
the trip point knob. Anyway, found "cooling_device11".

I do not understand this:

~$ cat /sys/devices/virtual/thermal/cooling_device11/stats/trans_table
cat: /sys/devices/virtual/thermal/cooling_device11/stats/trans_table: File too large

This is a known issue: the stats table cannot handle devices with too many cooling states, such as the 127 cooling states of the TCC Offset cooling device.
We can ignore this for now.

Rather than enter the actual TCC offset, I would rather enter the desired trip
point, and have the driver do the math to convert it to the offset.

Hmmm, a writable trip point? I need to think about this.

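The conversion being asked for is a subtraction plus a range check. A minimal Python sketch, assuming the default activation temperature is already known and that the hardware accepts offsets from 0 to some `max_offset` (the function name, the parameter names, and the numbers in the usage line are hypothetical, not taken from the patch):

```python
def trip_point_to_tcc_offset(default_activation_c, desired_trip_c, max_offset):
    """Convert a desired trip point (degrees C) into a TCC offset.

    max_offset is an assumed hardware limit; it depends on the width of
    the TCC Offset field on the processor in question.
    """
    offset = default_activation_c - desired_trip_c
    if offset < 0 or offset > max_offset:
        raise ValueError(
            "trip point %d C is not reachable (offset %d outside 0..%d)"
            % (desired_trip_c, offset, max_offset)
        )
    return offset


# Hypothetical numbers: default activation at 100 C, desired trip point 55 C.
print(trip_point_to_tcc_offset(100, 55, max_offset=63))
```

Rejecting out-of-range requests, rather than silently clamping them, keeps the user from believing a trip point is in effect when the hardware cannot express it.
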
Example step function overshoot, trip point set to 55 degrees C.

doug@s18:~$ sudo ~/turbostat --Summary --quiet --show Busy%,Bzy_MHz,PkgTmp,PkgWatt,GFXWatt,IRQ --interval 1
Busy%   Bzy_MHz IRQ     PkgTmp  PkgWatt GFXWatt
0.07    800     45      24      1.89    0.00
0.04    800     29      23      1.89    0.00
61.76   4546    4151    66      103.77  0.00 < step function load applied on 4 of 6 cores
67.76   4570    4476    66      120.42  0.00
68.03   4567    4488    66      120.73  0.00
67.98   4572    4492    67      121.00  0.00 < 19 degrees over trip point
68.10   4489    4493    58      109.19  0.00 < this throttling is either the power servo or the temp servo.
68.08   4262    4476    51      82.82   0.00 < this throttling is the temp servo.
68.13   4143    4513    48      75.16   0.00
68.03   4086    4488    46      71.87   0.00 < It actually undershoots often, I don't know why.
68.12   4000    4505    46      67.02   0.00 < often it doesn't undershoot.
68.44   4000    4502    45      67.16   0.00
68.06   4000    4483    45      66.95   0.00
68.02   3973    4490    44      65.20   0.00
67.94   3900    4489    43      60.51   0.00
67.88   3900    4501    44      60.55   0.00
67.85   3900    4472    43      60.52   0.00
67.96   3900    4481    43      60.59   0.00
68.26   3900    4501    44      60.70   0.00
67.93   3900    4498    43      60.58   0.00
68.03   3900    4476    43      60.68   0.00
67.83   3900    4481    44      60.54   0.00
35.06   3895    2412    25      32.13   0.00 < load removed.
0.04    800     25      24      1.89    0.00
0.04    800     22      23      1.89    0.00
0.06    800     35      23      1.90    0.00
0.03    800     18      23      1.89    0.00
0.04    800     26      22      1.90    0.00
0.30    1927    44      23      1.97    0.00
^C0.10  800     25      23      1.91    0.00
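
For anyone wanting to compare runs across platforms, a log like the one above can be reduced to a single overshoot figure. The sketch below is one possible way to do that in Python, assuming the turbostat summary text has been captured to a string; the function names are hypothetical, and trailing "<" annotations or interrupted rows are simply ignored.

```python
def parse_turbostat(text):
    """Parse turbostat --Summary output into a list of dicts keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Busy%":
            header = fields  # column names as printed by turbostat
            continue
        if header is None or len(fields) < len(header):
            continue
        try:
            values = [float(f) for f in fields[:len(header)]]
        except ValueError:
            continue  # skip annotation text or rows damaged by "^C"
        rows.append(dict(zip(header, values)))
    return rows


def peak_overshoot(rows, trip_point_c):
    """Return how far the highest PkgTmp sample exceeds the trip point."""
    return max(row["PkgTmp"] for row in rows) - trip_point_c
```

Calling `peak_overshoot(parse_turbostat(log_text), 55)` on a captured log gives the peak package temperature relative to a 55 degree trip point.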

Example long time to recover:
(actually, this example never recovers, unusual):
Note: 3.7 GHz is the limit.

doug@s18:~$ sudo ~/turbostat --Summary --quiet --show Busy%,Bzy_MHz,PkgTmp,PkgWatt,GFXWatt,IRQ --interval 30
Busy%   Bzy_MHz IRQ     PkgTmp  PkgWatt GFXWatt
67.58   3700    134812  42      52.15   0.00 <<< the trip point was changed from 37 to 57 degrees
67.90   3700    134964  42      52.08   0.00
68.07   3700    134424  42      52.06   0.00
68.01   3700    134415  41      50.76   0.00
68.14   3700    134521  41      50.78   0.00
68.11   3700    134424  42      50.75   0.00
68.03   3700    134329  42      50.70   0.00
68.11   3700    134321  42      50.76   0.00
68.05   3700    134456  42      51.09   0.00
68.12   3700    134549  42      52.21   0.00
68.12   3700    134482  42      52.19   0.00
68.10   3700    134301  42      52.20   0.00
68.11   3700    134444  42      52.14   0.00
68.08   3700    134422  42      52.17   0.00
68.07   3700    134430  42      52.23   0.00
68.00   3700    134723  42      52.12   0.00
67.96   3711    135207  44      52.53   0.00 <<< It takes 8 minutes until the frequency goes above 3.7 GHz
68.05   3765    134519  42      54.34   0.00
68.11   3771    134461  43      54.60   0.00
67.83   3763    134867  43      54.26   0.00
67.93   3773    134577  43      54.78   0.00 <<< But it never recovers, Why not?
...

For some unknown reason the processor seems to now think it is not heavily
loaded. From my MSR decoder:

0x64F: MSR_CORE_PERF_LIMIT_REASONS: 200020 AUTO AUTOL

From the book:
Autonomous Utilization-Based Frequency Control Status (R0) When set,
frequency is reduced below the operating system request because the
processor has detected that utilization is low.

Which is not true.
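
To make the decoder output above easier to follow, here is a small Python sketch that lists which bits are set in the value, read as hexadecimal 0x200020. The mapping of bit 5 to "AUTO" and bit 21 to "AUTOL" is an assumption inferred from the labels printed by the decoder; the actual bit assignments of MSR_CORE_PERF_LIMIT_REASONS vary by processor model.

```python
def set_bits(value):
    """Return the positions of all set bits in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


# Assumed mapping, taken from the labels in the decoder output quoted above.
LABELS = {5: "AUTO", 21: "AUTOL"}

raw = 0x200020
for bit in set_bits(raw):
    print(bit, LABELS.get(bit, "unknown"))
```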

Anyway,

Acked-by: Doug Smythies <redacted>

thanks,
rui