Thread: 12 messages, 4 authors, 2014-02-13

Re: Is it ok for deferrable timer wakeup the idle cpu?

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: 2014-02-03 06:51:18
Also in: lkml

Sorry, I was away on a short vacation.

On 28 January 2014 19:20, Frederic Weisbecker [off-list ref] wrote:
On Thu, Jan 23, 2014 at 07:50:40PM +0530, Viresh Kumar wrote:
Wait, I got the wrong code here. That wasn't my initial intention.
I actually wanted to write something like this:

 -       wake_up_nohz_cpu(cpu);
 +       if (!tbase_get_deferrable(timer->base) || idle_cpu(cpu))
 +               wake_up_nohz_cpu(cpu);

Will that work?
Something is seriously wrong with me; I wrote rubbish code again.
Let me phrase what I wanted to write :)

"don't send IPI to an idle CPU for a deferrable timer."

Probably I coded it correctly this time, at least.

-       wake_up_nohz_cpu(cpu);
+       if (!(tbase_get_deferrable(timer->base) && idle_cpu(cpu)))
+               wake_up_nohz_cpu(cpu);
Well, this is going to wake up the target from its idle state, which is
what we want to avoid if the timer is deferrable, right?
Yeah, sorry for doing it a second time :(
The simplest thing we want is:

           if (!tbase_get_deferrable(timer->base) || tick_nohz_full_cpu(cpu))
               wake_up_nohz_cpu(cpu);

This spares the IPI for the common case where the timer is deferrable and we run
in periodic or dynticks-idle mode (which should be 99.99% of the existing workloads).
I wasn't looking at this problem with NO_HZ_FULL in mind, as I thought it was
only about whether the CPU is idle or not. So the solution I was
talking about was:

"don't send IPI to an idle CPU for a deferrable timer."

But I see that still failing with the code you wrote. For normal cases where we
don't enable NO_HZ_FULL, we will still end up waking up idle CPUs, which
is what Lei Wen reported initially.

Also, if a CPU is marked for NO_HZ_FULL and is not currently idle, then we
wouldn't send an IPI for a deferrable timer. But we actually need that, so that
we can re-evaluate the timer order again?
Then we can later optimize that and spare the IPI on full dynticks CPUs when they run
idle, but that requires some special care about subtle races, which can't be dealt
with by a simple test on "idle_cpu(target)". And power consumption in full dynticks
is already very suboptimized anyway.

So I suggest we start simple with the above test, and a big fat comment which explains
what we are doing and what needs to be done in the future.