Thread: 30 messages, 5 authors, 2025-11-06

Re: [RESEND PATCH v7 7/7] cpuidle/poll_state: Poll via smp_cond_load_relaxed_timeout()

From: "Rafael J. Wysocki" <rafael@kernel.org>
Date: 2025-10-29 18:53:15
Also in: bpf, linux-arch, linux-pm, lkml

On Wed, Oct 29, 2025 at 5:42 AM Ankur Arora [off-list ref] wrote:

Rafael J. Wysocki [off-list ref] writes:
On Tue, Oct 28, 2025 at 6:32 AM Ankur Arora [off-list ref] wrote:
The inner loop in poll_idle() polls over the thread_info flags,
waiting to see if the thread has TIF_NEED_RESCHED set. The loop
exits once the condition is met, or if the poll time limit has
been exceeded.

To minimize the number of instructions executed in each iteration,
the time check is done only intermittently (once every
POLL_IDLE_RELAX_COUNT iterations). In addition, each loop iteration
executes cpu_relax() which on certain platforms provides a hint to
the pipeline that the loop busy-waits, allowing the processor to
reduce power consumption.

This is close to what smp_cond_load_relaxed_timeout() provides. So,
restructure the loop and fold the loop condition and the timeout check
in smp_cond_load_relaxed_timeout().
Well, it is close, but is it close enough?
I guess that's the question.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Daniel Lezcano <redacted>
Signed-off-by: Ankur Arora <redacted>
---
 drivers/cpuidle/poll_state.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..dc7f4b424fec 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -8,35 +8,22 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/idle.h>

-#define POLL_IDLE_RELAX_COUNT  200
-
 static int __cpuidle poll_idle(struct cpuidle_device *dev,
                               struct cpuidle_driver *drv, int index)
 {
-       u64 time_start;
-
-       time_start = local_clock_noinstr();
+       u64 time_end;
+       u32 flags = 0;

        dev->poll_time_limit = false;

+       time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
Is there any particular reason for doing this unconditionally?  If
not, then it looks like an arbitrary unrelated change to me.
Agreed. Will fix.
+
        raw_local_irq_enable();
        if (!current_set_polling_and_test()) {
-               unsigned int loop_count = 0;
-               u64 limit;
-
-               limit = cpuidle_poll_time(drv, dev);
-
-               while (!need_resched()) {
-                       cpu_relax();
-                       if (loop_count++ < POLL_IDLE_RELAX_COUNT)
-                               continue;
-
-                       loop_count = 0;
-                       if (local_clock_noinstr() - time_start > limit) {
-                               dev->poll_time_limit = true;
-                               break;
-                       }
-               }
+               flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
+                                                     (VAL & _TIF_NEED_RESCHED),
+                                                     (local_clock_noinstr() >= time_end));
So my understanding of this is that it reduces duplication with some
other places doing similar things.  Fair enough.

However, since there is "timeout" in the name, I'd expect it to take
the timeout as an argument.
The early versions did have a timeout, but that complicated the
implementation significantly. And the current users (poll_idle() and
rqspinlock) don't need a precise timeout.

smp_cond_load_relaxed_timed(), smp_cond_load_relaxed_timecheck()?

The problem with all suffixes I can think of is that it makes the
interface itself nonobvious.

Possibly something with the sense of bail out might work.
It basically has two conditions, one of which is checked in every step
of the internal loop and the other one is checked every
SMP_TIMEOUT_POLL_COUNT steps of it.  That isn't particularly
straightforward IMV.
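
To make the two-condition structure concrete, here is a minimal userspace
sketch of a loop with that shape. It is not the kernel implementation of
smp_cond_load_relaxed_timeout(): POLL_COUNT is a hypothetical stand-in for
SMP_TIMEOUT_POLL_COUNT (its real value is not given in this thread),
FLAG_NEED_RESCHED stands in for _TIF_NEED_RESCHED, and poll_flags(),
struct poll_result and fake_clock() are names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel values are not given here. */
#define POLL_COUNT        200u  /* plays the role of SMP_TIMEOUT_POLL_COUNT */
#define FLAG_NEED_RESCHED 0x1u  /* plays the role of _TIF_NEED_RESCHED */

typedef uint64_t (*clock_fn)(void);

struct poll_result {
	uint32_t flags;     /* last value loaded from the polled word */
	bool     timed_out; /* true if the time check ended the loop */
	uint64_t spins;     /* iterations that did not see the flag */
};

/* Illustrative fake clock: advances by one tick per call. */
static uint64_t fake_ticks;
static uint64_t fake_clock(void)
{
	return fake_ticks++;
}

/*
 * Poll *addr until (value & mask) is nonzero, or until now() >= time_end.
 * The flag condition is evaluated on every iteration; the time condition
 * only once every POLL_COUNT iterations.
 */
static struct poll_result poll_flags(const volatile uint32_t *addr,
				     uint32_t mask, clock_fn now,
				     uint64_t time_end)
{
	struct poll_result r = { 0, false, 0 };
	unsigned int n = 0;

	assert(addr != NULL && now != NULL);

	for (;;) {
		r.flags = *addr;        /* condition 1: every iteration */
		if (r.flags & mask)
			break;

		r.spins++;
		/* The kernel loop would execute cpu_relax() here. */
		if (++n < POLL_COUNT)
			continue;

		n = 0;
		if (now() >= time_end) { /* condition 2: every POLL_COUNT iterations */
			r.timed_out = true;
			break;
		}
	}
	return r;
}
```

In this sketch the loop can overrun time_end by up to POLL_COUNT - 1
iterations, since the deadline is only looked at on every POLL_COUNT-th
pass. The old poll_idle() loop has the same property, but there the
iteration count and the time check are both visible at the call site.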

Honestly, I prefer the existing code.  It is much easier to follow and
I don't see why the new code would be better.  Sorry.