Re: [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union
From: Florent Revest <hidden>
Date: 2023-08-03 16:38:15
Also in:
bpf, lkml
On Thu, Aug 3, 2023 at 5:42 PM Masami Hiramatsu [off-list ref] wrote:
On Wed, 2 Aug 2023 16:44:09 +0200 Florent Revest [off-list ref] wrote:quoted
On Wed, Aug 2, 2023 at 1:09 AM Steven Rostedt [off-list ref] wrote:quoted
On Tue, 1 Aug 2023 15:18:56 -0700 Alexei Starovoitov [off-list ref] wrote:quoted
On Tue, Aug 1, 2023 at 8:32 AM Steven Rostedt [off-list ref] wrote:quoted
On Tue, 1 Aug 2023 11:20:36 -0400 Steven Rostedt [off-list ref] wrote:quoted
The solution was to come up with ftrace_regs, which just means it has all the registers to extract the arguments of a function and nothing more. MostThis isn't 100% true. The ftrace_regs may hold a fully filled pt_regs. As the FTRACE_WITH_REGS callbacks still get passed a ftrace_regs pointer. They will do: void callback(..., struct ftrace_regs *fregs) { struct pt_regs *regs = ftrace_get_regs(fregs); Where ftrace_get_regs() will return the pt_regs only if it is fully filled. If it is not, then it returns NULL. This was what the x86 maintainers agreed with.arch/arm64/include/asm/ftrace.h:#define arch_ftrace_get_regs(regs) NULL Ouch. That's very bad. We care a lot about bpf running well on arm64.[ Adding Mark and Florent ]Ah, thanks Steve! That's my favorite can of worms :) I actually consider sending a talk proposal to the tracing MC at LPC "pt_regs - the good the bad and the ugly" on this very topic because I care about unblocking BPF "multi_kprobe" (which really is fprobe) on arm64, maybe it would be interesting.Ah, it is almost same as my talk :)
Oh, I didn't know! I submitted a proposal today but if the talks have a lot of overlap maybe it's best that only you give your talk, since you're the actual maintainer :) or we could co-present if there's something I could add but I think you have all the background anyway
quoted
I pointed this out in https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@mail.gmail.com/#t (local) when Masami proposed adding calls from fprobe to perf. If every subsystem makes different assumptions about "how sparse" their pt_regs is and they call into one another, this could lead to... interesting bugs. (eg: currently, we don't populate a fake pstate in ftrace_regs. so we'd need to fake it when creating a sparse pt_regs _for Perf_, knowing that Perf specifically expects this reg to be set. this would require a struct copy anyway and some knowledge about how the data will be consumed, in an arch- and subsystem- specific way)yeah, sorry I missed that point. I should remove it until we can fix it.
Uh, I shouldn't have buried my important comments so far down the email :/ I wasn't sure whether you had missed the paragraph.
quoted
On the other hand, untangling all code paths that come from trampolines (with a light regs structure) from those that come from an exception (with a pt_regs) could lead to a lot of duplicated code, and converting between each subsystem's idea of a light regs structure (what if perf introduces a perf_regs now ?) would be tedious and slow (lots of copies ?).This is one discussion point I think. Actually, using pt_regs in kretprobe (and rethook) is histrical accident. Originally, it had put a kprobe on the function return trampoline to hook it. So keep the API compatiblity I made the hand assembled code to save the pt_regs on the stack. My another question is if we have the fprobe to trace (hook) the function return, why we still need the kretprobe itself. I think we can remove kretprobe and use fprobe exit handler, because "function" probing will be done by fprobe, not kprobe. And then, we can simplify the kprobe interface and clarify what it is -- "kprobe is a wrapper of software breakpoint". And we don't need to think about duplicated code anymore :)
That sounds reasonable to me
As I said, I would like to phase out the kretprobe itself because it provides the same feature of fprobe, which is confusing. jprobe was removed a while ago, and now kretprobe is. But we can not phase out it at once. So I think we will keep current kretprobe trampoline on arm64 and just add new ftrace_regs based rethook. Then remove the API next release. (after all users including systemtap is moved)
Heads up to BPF folks though since they also have BPF "kretprobe" program types which would break in a similar fashion as multi_kprobe (even though BPF kretprobe programs have also been discouraged for a while in favor of BPF fexit programs)
quoted
quoted
The reason I started the FTRACE_WITH_ARGS (which gave us ftrace_regs) in the first place, was because of the overhead you reported to me with ftrace_regs_caller and why you wanted to go the direct trampoline approach. That's when I realized I could use a subset because those registers were already being saved. The only reason FTRACE_WITH_REGS was created was it had to supply full pt_regs (including flags) and emulate a breakpoint for the kprobes interface. But in reality, nothing really needs all that.quoted
It's not about access to args. pt_regs is passed from bpf prog further into all kinds of perf event functions including stack walking.If all accesses are done in BPF bytecode, we could (theoretically) have the verifier and JIT work together to deny accesses to unpopulated fields, or relocate pt_regs accesses to ftrace_regs accesses to keep backward compatibility with existing multi_kprobe BPF programs.Yeah, that is what I would like to suggest, and what my patch does. (let me update rethook too, it'll be a bit tricky since I don't want break anything)
I agree with Alexei that this is an unnecessary amount of complexity in the verifier just to avoid a struct copy though. It's good to know that we _could_ do it if we really need to someday but then again, if a user chooses an interface that gets a pt_regs they shouldn't expect high performance. Therefore, I think it's ok for BPF multi_kprobe to copy fields from a ftrace_regs to a pt_regs on stack, especially if it avoids so much additional complexity in the verifier.