Thread (13 messages) 13 messages, 3 authors, 2018-03-27

Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()

From: "Rafael J. Wysocki" <rafael@kernel.org>
Date: 2018-03-26 21:44:31
Also in: lkml

On Mon, Mar 26, 2018 at 6:32 PM, Rik van Riel [off-list ref] wrote:
On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
quoted
On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
quoted
I plan to try two more things:

1) Disable polling on SMT systems, with
   the idea that putting one thread to
   sleep with monitor/mwait in C1 will
   allow the other thread to run faster.
Sounds plausible.
Plausible, but wrong. Tests showed that CPU use
during the peak load of this test increased from
about 71% to about 78% with this change, or just
under 10% increase relative to the baseline.

Coincidentally, that is the same CPU use increase
I have seen with the poll_idle() changes. Not sure
if that means anything...
quoted
quoted
2) Insert more cpu_relax() calls into the
   main loop, so the CPU core spends more
   of its time in cpu_relax() and less
   time doing other things:
Well, maybe it's a matter of doing cpu_relax() between any other bits
of
significant computation in there:
I tried that, as well, and some other variations.

Every single change to poll_idle() that I tried
seems to result in a 9-10% relative increase in
CPU use during the peak load of the test.

During the busiest parts of the load, every CPU
sees on the order of 20k context switches a second.
Hmm.  Basically, you are saying that

while (something)
    cpu_relax();

is measurably less overhead than

while (something) {
    cpu_relax();
    check something else;
    cpu_relax();
}

which honestly makes me wonder how this is possible at all.

Unless, that is, the Doug's theory actually matches the reality and
the difference is related to the fact that the "something else" causes
the loop to terminate after fewer steps.

Which in turn causes poll_idle() to return to the outer loop in
do_idle() and *that* is enough overhead to explain what you are
seeing.

So, it looks like after the change causing the poll_idle() loop to
terminate in fewer steps (which is generally good for power and also
for performance in some workloads) you are measuring the overhead of
the outer loop in do_idle() instead of the overhead of the loop in
poll_idle() itself.

With that in mind (and I'm going to assume that this is the case until
I see convincing evidence to the contrary), it seems to me that your
workload will be unhappy if *any* changes are made to poll_idle(), but
we simply have to change it, because today it is kind of a design
disaster.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help