Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-10-22 14:27:59
Also in:
bpf
On Wed, 22 Oct 2025 14:32:19 +0200 Jiri Olsa [off-list ref] wrote:
thanks for the report.. so above is from arm? yes the x86_64 starts with:

	unwind_start(&state, current, NULL, (void *)regs->sp);

I seem to get reasonable stack traces on x86 with the change below, which just initializes fields in regs that are used later on and sets the stack so the ftrace_graph_ret_addr code is triggered during unwind

but I'm not familiar with this code, Masami, Josh, any idea?
Oh! This is an issue with a stack trace happening from a callback of the exit handler? OK, that makes much more sense, as I don't think the code handles that properly.
thanks,
jirka

---
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 367da3638167..2d2bb8c37b56 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
 SYM_CODE_START(return_to_handler)
 	UNWIND_HINT_UNDEFINED
I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit this, it should just give up. This is because tracing the exit of the function really doesn't fit in the normal execution paradigm. The entry is easy. It's the same as if the callback was called by the function being traced. The exit is more difficult because the function being traced has already done its return. Now the callback is in this limbo area of being called between a return and the caller.
 	ANNOTATE_NOENDBR
+	push $return_to_handler
+	UNWIND_HINT_FUNC
OK, so what happened here is that you put return_to_handler onto the stack and told ORC that this is a normal function, so that when it triggers, it does a lookup from the handler itself. I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?
 	/* Save ftrace_regs for function exit context */
 	subq $(FRAME_SIZE), %rsp
@@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
 	movq %rax, RAX(%rsp)
 	movq %rdx, RDX(%rsp)
 	movq %rbp, RBP(%rsp)
+	movq %rsp, RSP(%rsp)
+	movq $0, EFLAGS(%rsp)
+	movq $__KERNEL_CS, CS(%rsp)
Is this simulating some kind of interrupt?
 	movq %rsp, %rdi
 	call ftrace_return_to_handler
Now it gets tricky in the ftrace_return_to_handler as the first thing it does is to pop the shadow stack, which makes the return_to_handler lookup different, as it's no longer on the stack that the unwinder will use. The return address will live in the "ret" variable of that function, which the unwinder will not have access to. Yeah, this will not be easy to solve.

-- Steve
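
To illustrate the problem described above, here is a minimal user-space sketch of a shadow stack. It is not kernel code: every name in it (struct shadow_stack, hook_entry, hook_exit, resolve_ret, TRAMPOLINE) is hypothetical and only loosely modeled on what the function graph tracer and ftrace_graph_ret_addr do. It shows that a trampoline address found on the stack can be translated back to the real return address only while the entry is still on the shadow stack.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_DEPTH 8
/* Hypothetical marker standing in for the address of return_to_handler. */
#define TRAMPOLINE ((uintptr_t)0xdeadbeef)

struct shadow_stack {
	uintptr_t ret[SHADOW_DEPTH];	/* saved original return addresses */
	int top;			/* number of live entries */
};

/* Entry hook: save the real return address and hijack the stack slot. */
void hook_entry(struct shadow_stack *ss, uintptr_t *slot)
{
	assert(ss->top < SHADOW_DEPTH);
	ss->ret[ss->top++] = *slot;
	*slot = TRAMPOLINE;
}

/*
 * Exit hook: pop the entry and hand back the original return address.
 * From here on the address lives only in the caller's local variable.
 */
uintptr_t hook_exit(struct shadow_stack *ss)
{
	assert(ss->top > 0);
	return ss->ret[--ss->top];
}

/*
 * Unwinder helper: translate a value read from the stack. *idx counts how
 * many trampoline slots were already resolved, starting from the newest
 * shadow stack entry. Returns 0 when the trampoline cannot be resolved.
 */
uintptr_t resolve_ret(const struct shadow_stack *ss, int *idx, uintptr_t val)
{
	int i;

	if (val != TRAMPOLINE)
		return val;

	i = ss->top - 1 - *idx;
	if (i < 0)
		return 0;

	(*idx)++;
	return ss->ret[i];
}
```

In this model, calling resolve_ret() between hook_entry() and hook_exit() recovers the original address, while the same call made after hook_exit() returns 0, because the entry has already been popped. That is the situation a callback run from the exit handler is in.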
@@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
 	movq RDX(%rsp), %rdx
 	movq RAX(%rsp), %rax
-	addq $(FRAME_SIZE), %rsp
+	addq $(FRAME_SIZE) + 8, %rsp
+
 	/*
 	 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
 	 * since IBT would demand that contain ENDBR, which simply isn't so for