On Tue, 2016-02-23 at 14:29 +0000, Mel Gorman wrote:
Added a suggested change from Doug Smythies and can add a Signed-off-by
if Doug is ok with that.
Changelog since v1
o Remove divide that is likely unnecessary (dsmythies)
o Rebase on top of linux-pm/linux-next
The PID relies on samples of equal time but this does not apply for
deferrable timers when the CPU is idle. intel_pstate checks if the actual
duration between samples is large and if so, the "busyness" of the CPU
is scaled.
This assumes the delay was a deferred timer but a workload may simply have
been idle for a short time if it's context switching between a server and
client or waiting very briefly on IO. It's compounded by the problem that
server/clients migrate between CPUs due to wake-affine trying to maximise
hot cache usage. In such cases, the cores are not considered busy and the
frequency is dropped prematurely.
This patch increases the hold-off value before the busyness is scaled. It
was selected based simply on testing until the desired result was found.
Tests were conducted with workloads that are either client/server based
or short-lived IO.
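For readers following the thread, the hold-off behaviour described above can be sketched roughly as below. This is a minimal illustration, not the intel_pstate source: the function name scale_busy, its parameters, and the integer percentage representation are all assumptions made for the example, and the multiplier values used afterwards are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from intel_pstate): scale a busyness
 * value when the gap between samples exceeds a hold-off threshold.
 *
 * busy_pct      busyness measured for the sample (0..100)
 * sample_us     nominal sampling interval the PID expects
 * duration_us   actual time elapsed since the previous sample
 * holdoff_mult  how many nominal intervals may pass before scaling
 */
static int32_t scale_busy(int32_t busy_pct, int64_t sample_us,
                          int64_t duration_us, int32_t holdoff_mult)
{
    assert(sample_us > 0 && duration_us > 0 && holdoff_mult > 0);

    /* Within the hold-off window: trust the sample as measured. */
    if (duration_us <= sample_us * holdoff_mult)
        return busy_pct;

    /* Long gap: assume the CPU idled and shrink busyness by sample/duration. */
    return (int32_t)((int64_t)busy_pct * sample_us / duration_us);
}
```

With a 10000 us nominal interval and a 40000 us gap, a multiplier of 2 scales a busyness of 80 down to 20, while a multiplier of 8 leaves it at 80. That is the effect a larger hold-off is meant to have for workloads that idle only briefly.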
Attached is a specpower comparison for a Haswell EP Grantley server.
This workload ran for a little over an hour.
Difference in OPS:       +1019
Difference in power:     +308.6
Difference in perf/watt: -312.479023
So we are consuming about 308 Watts more on average to do 1019 more
operations.
Thanks,
Srinivas