
Re: [PATCH] rethook: Use tsk->on_cpu to check task execution state

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: 2026-06-09 04:41:59
Also in: bpf, lkml

On Mon, 8 Jun 2026 16:06:54 +0200
Peter Zijlstra [off-list ref] wrote:
On Mon, Jun 08, 2026 at 10:08:11PM +0900, Masami Hiramatsu wrote:
Anyway, I'm wondering what the purpose of this check here is, there is
no real comment, and commit 5120d167e21c ("rethook: Remove warning
messages printed for finding return address of a frame.") is just pure
voodoo as well.
FWIW, you should have had this discussion then.
Indeed. The rethook builds a shadow stack as a list, so the caller must
guarantee that the target task stays blocked at least for the duration of
this function.

The commit messages suggest that when BPF takes a backtrace, it also
includes other running tasks. Is that safe?
Well, you get to keep the pieces. At this point safe only pertains to
'doesn't-crash', all correctness is out the window.

I always forget the crazy BPF does ;-)
Also, note the comment that goes with the usage of
task_on_another_cpu(); that thing is racy as all heck.

So it really comes down to what the purpose of this check is.
This check was introduced when the code was copied from
kretprobe_find_ret_addr(), which has the comment:

 * The @tsk must be 'current' or a task which is not running. @fp is a hint

IIRC, I added this check to explicitly verify this condition.
Right, but it is a prescriptive comment, not an explanatory one. That
is, it doesn't explain the condition.
I suspect the issue at hand is that tsk->rethook elements, such as
iterated by __rethook_find_ret_addr() are not safe to be accessed for a
running task.

Notably while rethook_recycle() has some RCU thing on, that objpool
thing (and the recycle name itself) seems to strongly suggest iterating
these things is not sound (you could start with things from this task,
hit a recycled entry and continue iterating rethooks from another task).
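
The recycling hazard described above can be modelled in plain userspace C.
This is only a sketch under stated assumptions, not the kernel's rethook or
objpool code: struct node, struct task_model, push_hook(),
pop_and_recycle(), count_foreign() and recycle_scenario() are all
hypothetical names, and the "pool" is a bare free list with no RCU at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a rethook node; NOT the kernel structure. */
struct node {
	struct node *next;
	int owner;		/* id of the task currently using this node */
};

/* Hypothetical stand-in for a task with a per-task shadow stack. */
struct task_model {
	int id;
	struct node *hooks;	/* newest entry first */
};

static struct node *free_list;	/* toy object pool: a simple free list */

static void pool_put(struct node *n)
{
	n->next = free_list;
	free_list = n;
}

static struct node *pool_get(void)
{
	struct node *n = free_list;

	assert(n);		/* the toy pool never runs dry in this demo */
	free_list = n->next;
	n->next = NULL;
	return n;
}

/* Task enters a hooked function: take a node and push it on its list. */
static void push_hook(struct task_model *t)
{
	struct node *n = pool_get();

	n->owner = t->id;
	n->next = t->hooks;
	t->hooks = n;
}

/* Task returns from a hooked function: pop the node and recycle it. */
static void pop_and_recycle(struct task_model *t)
{
	struct node *n = t->hooks;

	assert(n);
	t->hooks = n->next;
	pool_put(n);
}

/* Walk from 'pos' and count entries that do not belong to 'owner'. */
static int count_foreign(const struct node *pos, int owner)
{
	int foreign = 0;

	for (; pos; pos = pos->next)
		if (pos->owner != owner)
			foreign++;
	return foreign;
}

/*
 * A walker grabs task A's list head. If 'recycle' is set, A then returns
 * from its newest hooked function and task B reuses the freed node before
 * the walker continues. Returns how many foreign entries the walker sees.
 */
static int recycle_scenario(int recycle)
{
	static struct node storage[3];
	struct task_model a = { .id = 1 }, b = { .id = 2 };
	const struct node *pos;
	size_t i;

	free_list = NULL;
	for (i = 0; i < 3; i++)
		pool_put(&storage[i]);

	push_hook(&a);
	push_hook(&a);
	push_hook(&b);
	pos = a.hooks;		/* walker starts on A's newest entry */

	if (recycle) {
		pop_and_recycle(&a);	/* A runs and returns */
		push_hook(&b);		/* B gets the very same node */
	}
	return count_foreign(pos, a.id);
}
```

Without the recycle step the walker only ever sees A's entries. With it,
the pointer the walker already holds now leads into B's list, which is the
"continue iterating rethooks from another task" case.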

Also note that the current check is also racy, nothing really prevents a
wakeup from happening right after you observe task_is_running() being
false. The task can then get scheduled in on another CPU and tear down
its rethooks concurrent with __rethook_find_ret_addr().
Yeah, but is there any way to ensure the task is blocked? Even if it is
blocked, e.g. in TASK_UNINTERRUPTIBLE, it may not be possible to ensure
that without holding an actual lock in the rethook?

Of course, we could give up on checking within this function and leave
everything to the caller to guarantee - as kretprobe does.

BTW, the reason why we made it possible to pass tasks other than current
is that the stack unwinding code itself supported unwinding tasks other
than current, so we had no choice but to create this interface.

However, it is a bad idea to check this deep inside the unwinding code.
This. You cannot take locks in unwinding. The only thing you can do is
try to do the best you can without crashing.

Typically unwind only happens on self -- this is natural, a task crashes
and unwinds itself, or a task does something (takes a lock, hits a
tracepoint, etc) and takes a snapshot of its own stack, and this is
safe.

Things like live-patch use task_call_func(), which ensures the callback
function is done while holding sufficient locks for the task to not
change state.
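
The "run the callback while the task cannot change state" pattern can be
sketched as a userspace analogy. None of this is kernel code: struct
task_model, task_call_func_model(), try_wake_up_model() and
snapshot_if_blocked() are invented for illustration, and the 'locked' flag
merely stands in for whatever locks the real helper holds. A real wakeup
would wait for the lock; in this single-threaded model it simply fails.

```c
#include <assert.h>

enum { TASK_RUNNING_M, TASK_BLOCKED_M };	/* hypothetical model states */

/* Hypothetical task model; NOT struct task_struct. */
struct task_model {
	int locked;		/* stands in for the task's state locks */
	int state;
	int hooks_depth;	/* pretend shadow-stack depth */
};

typedef int (*task_call_f_model)(struct task_model *t, void *arg);

/* Run 'fn' on 't' while its state is pinned by the model lock. */
static int task_call_func_model(struct task_model *t,
				task_call_f_model fn, void *arg)
{
	int ret;

	assert(!t->locked);
	t->locked = 1;
	ret = fn(t, arg);
	t->locked = 0;
	return ret;
}

/* A wakeup needs the same lock, so it cannot happen under the callback. */
static int try_wake_up_model(struct task_model *t)
{
	if (t->locked)
		return 0;
	t->state = TASK_RUNNING_M;
	return 1;
}

/* Callback: read the shadow-stack depth only if the task is blocked. */
static int snapshot_if_blocked(struct task_model *t, void *arg)
{
	int *depth = arg;

	if (t->state != TASK_BLOCKED_M)
		return -1;	/* refuse to inspect a running task */

	/* A wakeup attempted right now is held off by the lock. */
	if (try_wake_up_model(t))
		return -1;

	*depth = t->hooks_depth;
	return 0;
}
```

The point of the design is that the state check and the access happen under
the same lock, so the check cannot go stale before the access completes.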
Hmm, is there any way to ensure the function is called from task_call_func()?
(Maybe by checking p->pi_lock, but that cannot tell whether this context is
the lock owner?) If not, I need to either make this available only for the
current task (it just returns the kretprobe trampoline address anyway, so
this is not a critical issue) or introduce a spinlock.

Or, eventually it may be better to replace kretprobe/rethook with the
fprobe return handler.

Thank you,

-- 
Masami Hiramatsu (Google) [off-list ref]