Thread (23 messages), 4 authors, 2021-09-15

Re: [PATCH] cpufreq: intel_pstate: Force intel_pstate to load when HWP disabled in firmware

From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: 2021-05-13 11:04:04
Also in: lkml

On Thu, 2021-05-13 at 12:10 +0200, Giovanni Gherdovich wrote:
On Thu, 2021-05-13 at 02:24 -0700, Srinivas Pandruvada wrote:
On Thu, 2021-05-13 at 09:59 +0200, Giovanni Gherdovich wrote:
On CPUs succeeding SKX, e.g. ICELAKE_X, intel_pstate doesn't load unless
CPUID advertises support for the HWP feature. Some OEMs, however, may
offer users the possibility to disable HWP from the BIOS config utility
by altering the output of CPUID.
Is someone providing a utility? What is the case for broken HWP?
Yes, I know of at least one server manufacturer that ships a BIOS config
utility where the user can disable HWP.

On such a server machine, which has an ICELAKE_X CPU, if the user
unchecks HWP via BIOS then intel_pstate will refuse to load, saying:

    intel_pstate: CPU model not supported

because ICELAKE_X is not in the list intel_pstate_cpu_ids (defined in
intel_pstate.c) of CPUs that intel_pstate supports when HWP is absent
from CPUID; that list ends at SKYLAKE_X.
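
To make the load decision described above concrete, here is a minimal
userspace sketch in C. It is not the kernel code: the enum values and the
names no_hwp_models, model_in_list and driver_can_load are hypothetical,
and the HWP branch is simplified to accept any CPU whose CPUID advertises
HWP.

```c
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative identifiers, not real CPUID model numbers. */
enum cpu_model { MODEL_BROADWELL_X, MODEL_SKYLAKE_X, MODEL_ICELAKE_X };

/* Stand-in for intel_pstate_cpu_ids: models accepted when HWP is absent. */
static const enum cpu_model no_hwp_models[] = {
    MODEL_BROADWELL_X,
    MODEL_SKYLAKE_X, /* per the thread, the list ends at SKYLAKE_X */
};

/* Linear search of the no-HWP list. */
static bool model_in_list(enum cpu_model m)
{
    size_t n = sizeof(no_hwp_models) / sizeof(no_hwp_models[0]);

    for (size_t i = 0; i < n; i++)
        if (no_hwp_models[i] == m)
            return true;
    return false;
}

/* Returns true if the (sketched) driver would register. */
static bool driver_can_load(enum cpu_model m, bool cpuid_has_hwp)
{
    if (cpuid_has_hwp)
        return true;         /* assumption: HWP-capable CPUs are accepted */
    return model_in_list(m); /* false maps to "CPU model not supported" */
}
```

With these definitions, driver_can_load(MODEL_ICELAKE_X, false) is false,
which corresponds to the "CPU model not supported" message, while the
same model with HWP visible in CPUID loads.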

An alternative approach to register intel_pstate in the case I'm
describing would be to add ICELAKE_X (and every CPU model after that,
forever?) to the list intel_pstate_cpu_ids.
This is not nice, but unlike client CPUs, server CPUs don't get released
often. There are a couple of years in between.
It is possible that some users don't want to use HWP, because their
workloads work better without HWP. But that doesn't mean HWP is broken.
That's true, a user may legitimately want to disable HWP, and we have
the intel_pstate=no_hwp option for that. But for that option to work
CPUID must still show that the CPU is HWP-capable; when disablement
happens in BIOS, that's not the case.
Correct.
The wording "hwp_broken_firmware" deliberately has a negative
connotation (the intended meaning is: "firmware is broken, regarding
HWP"), carrying the not-so-subtle message "OEM folks, please don't do
this". My understanding is that the preferred way to disable HWP is with
intel_pstate=no_hwp; the firmware should stay out of it.
For me "broken" means that Intel has some bug, which is not the case,
even if the intention is to carry a message to OEMs.

no_hwp is for disabling HWP even if HWP is supported.

The problem is that if we override the supported CPU list using some
kernel command line option, some users may crash the system when running
on old hardware where some of the MSRs we rely on are not present. We
don't read MSRs in failsafe mode, so they will fault. We are checking
some MSRs but not all. Also, what will be the default
(struct pstate_funcs *)id->driver_data if the CPU model doesn't match?
I think it is better to add the CPU model instead. We did that for SKX
on user request.
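
The driver_data concern can be illustrated with a small userspace sketch,
assuming a match table whose entries carry a pointer to per-model
callbacks. All names here (pstate_funcs_sketch, cpu_id_sketch, match_cpu,
funcs_for) and the value returned by the placeholder callback are made up
for illustration; the real kernel structures differ.

```c
#include <stddef.h>

/* Arbitrary illustrative identifiers, not real CPUID model numbers. */
enum cpu_model { MODEL_NONE = 0, MODEL_SKYLAKE_X, MODEL_ICELAKE_X };

/* Hypothetical stand-in for the per-model callback set. */
struct pstate_funcs_sketch {
    int (*get_max)(void);
};

static int core_get_max(void)
{
    return 30; /* placeholder value */
}

static const struct pstate_funcs_sketch core_funcs = {
    .get_max = core_get_max,
};

/* Hypothetical table entry: a model plus an opaque per-model pointer. */
struct cpu_id_sketch {
    enum cpu_model model;
    const void *driver_data;
};

static const struct cpu_id_sketch supported_ids[] = {
    { MODEL_SKYLAKE_X, &core_funcs },
    { MODEL_NONE, NULL }, /* terminator */
};

/* Returns the matching entry, or NULL if the model is not listed. */
static const struct cpu_id_sketch *match_cpu(enum cpu_model m)
{
    const struct cpu_id_sketch *id;

    for (id = supported_ids; id->model != MODEL_NONE; id++)
        if (id->model == m)
            return id;
    return NULL;
}

/* Without a table match there is no driver_data to cast, hence no callbacks. */
static const struct pstate_funcs_sketch *funcs_for(enum cpu_model m)
{
    const struct cpu_id_sketch *id = match_cpu(m);

    if (!id)
        return NULL;
    return (const struct pstate_funcs_sketch *)id->driver_data;
}
```

In this sketch funcs_for(MODEL_ICELAKE_X) returns NULL: a command line
override that skips the table match would still have to decide which
callbacks to use, whereas adding the model to the table answers that
question explicitly.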

Thanks,
Srinivas
I hope this clarifies the problem (there is an ICELAKE_X somewhere out
there that can't load intel_pstate, which is not nice) and the intention
(discouraging disablement of HWP via firmware).


Giovanni
  