Thread: 17 messages, 3 authors, 17d ago

Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: 2026-06-08 02:56:50
Also in: lkml

On Mon, 8 Jun 2026 09:52:37 +0800
Tengda Wu [off-list ref] wrote:

On 2026/6/5 21:43, Masami Hiramatsu wrote:
quoted
On Thu, 4 Jun 2026 11:34:45 +0200
Peter Zijlstra [off-list ref] wrote:
quoted
On Mon, Jun 01, 2026 at 08:40:01AM +0900, Masami Hiramatsu wrote:
quoted
Peter, is it OK to drop @rq from task_on_cpu()? 
Sure.
quoted
Then we can use it from rethook.
Well, it is in sched/sched.h, which is an internal header, and no you
cannot use that header in rethook.
Ah, OK. Hmm, then we should not use it. Maybe ->on_cpu is also internal
state?
quoted
But let's step back first: what is the actual problem here, why are we
looking at ->on_cpu at all?
Tengda, can you explain it?
I think you want to take a stacktrace of a !current task, and
rethook_find_ret_addr() rejects it if the task is in the running state.

But if you can share the actual situation and what you need, it would be
helpful for us to understand.

Thank you,

Sure.

Background: We are verifying the support of live patches for functions that
have a kretprobe. The specific verification method is as follows:

We construct a function foo() that calls bar():

void bar(void)
{
    for (;;) {
        schedule();
    }
}

void foo(void)
{
    bar();
}

A kretprobe is attached to bar():

echo 'r:rp1 bar' > /sys/kernel/tracing/kprobe_events
echo 1 > /sys/kernel/tracing/events/kprobes/rp1/enable

Then foo() is triggered. The expected behavior is that bar() will call
schedule() and yield the CPU.

After that, the live patch is activated to attempt replacing the implementation
of foo(). The expectation is that this should succeed.

However, in reality, because the task that called schedule() is still in the
RUNNING state, the task_is_running(tsk) check inside rethook_find_ret_addr()
triggers, causing the function to return early. This, in turn,
prevents stack_trace_save_tsk_reliable() from determining the stack as
reliable, leading to a failure in activating the live patch.
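
To make the distinction concrete, here is a minimal user-space sketch of the two predicates under discussion. It is only a model: struct model_task, MODEL_TASK_RUNNING and every model_* function are hypothetical names invented for illustration and are not kernel APIs. It assumes, as described above, that calling schedule() without first changing the task state takes the task off the CPU while leaving it runnable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel definitions; not the real task_struct. */
#define MODEL_TASK_RUNNING 0u

struct model_task {
	unsigned int state;	/* models the scheduler state (runnable vs. sleeping) */
	int on_cpu;		/* models tsk->on_cpu: nonzero while executing on a CPU */
};

/*
 * Models schedule() called without changing the task state first:
 * the task leaves the CPU but stays runnable.
 */
void model_yield(struct model_task *t)
{
	assert(t != NULL);
	t->on_cpu = 0;
}

/* Models the existing check: reject any non-current task that is runnable. */
bool model_rejects_by_state(const struct model_task *t,
			    const struct model_task *cur)
{
	return t != cur && t->state == MODEL_TASK_RUNNING;
}

/* Models the proposed check: reject only a non-current task that is on a CPU. */
bool model_rejects_by_on_cpu(const struct model_task *t,
			     const struct model_task *cur)
{
	return t != cur && t->on_cpu != 0;
}
```

For a task that has gone through model_yield(), model_rejects_by_state() still returns true while model_rejects_by_on_cpu() returns false. That is the gap between "runnable" and "currently executing" that the report hinges on.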
Hmm, is bar() doing an infinite loop, or a bounded loop that takes a long time
and so just yields for a while? Anyway, it does not seem like a good design pattern.
Is it possible to avoid busy loops and instead use workers, or wait for
something to complete or for input within the loop?
**Not sure if this is correct:**

We believe that after a task voluntarily calls schedule(), when the stack
is expected to be reliable, it is a safe time to activate a live patch.
In this case, I don't know how to block the loop inside bar().
Even if tsk->on_cpu is 0, the task can start running again right after
the flag is checked.
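
The window described here is a check-then-act race. The sketch below models it deterministically in user space: a hypothetical callback stands in for whatever could happen between the flag check and the stack walk. All names (struct model_task, model_check_then_walk(), model_resume()) are invented for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; only the flag relevant to the race. */
struct model_task {
	int on_cpu;	/* models tsk->on_cpu */
};

enum model_result {
	MODEL_REJECTED,		/* task was on a CPU at check time */
	MODEL_WALK_OK,		/* task stayed off-CPU during the walk */
	MODEL_WALK_RACED	/* task resumed after the check */
};

/* Callback standing in for events inside the check-to-walk window. */
typedef void (*model_window_fn)(struct model_task *);

/* Models the scheduler putting the task back on a CPU. */
void model_resume(struct model_task *t)
{
	t->on_cpu = 1;
}

/* Check the flag, then "walk"; report whether the check went stale. */
enum model_result model_check_then_walk(struct model_task *t,
					model_window_fn in_window)
{
	assert(t != NULL);

	if (t->on_cpu)
		return MODEL_REJECTED;

	/* Nothing prevents the task from resuming at this point. */
	if (in_window != NULL)
		in_window(t);

	return t->on_cpu ? MODEL_WALK_RACED : MODEL_WALK_OK;
}
```

Passing model_resume as the callback yields MODEL_WALK_RACED even though the up-front check passed, which is why a bare flag test cannot by itself make the walk safe.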
Additionally, a similar tsk->on_cpu check can be found elsewhere in the
kernel (See task_on_another_cpu() in arch/x86/include/asm/unwind.h).
Therefore, we propose changing the task_is_running(tsk) condition to
tsk->on_cpu.
Yes, but the comment at the call site says there are other checks to guard against the race.

        /*                  
         * Refuse to unwind the stack of a task while it's executing on another
         * CPU.  This check is racy, but that's ok: the unwinder has other
         * checks to prevent it from going off the rails.
         */
        if (task_on_another_cpu(task))
                goto err;

Josh, do you know how this avoids the race?

Thank you,
Thanks,
Tengda

-- 
Masami Hiramatsu (Google) [off-list ref]