Re: [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: 2023-08-02 14:08:37
Also in:
bpf, lkml
On Tue, 1 Aug 2023 19:22:01 -0700 Alexei Starovoitov [off-list ref] wrote:

> On Tue, Aug 1, 2023 at 5:44 PM Steven Rostedt [off-list ref] wrote:
> > On Tue, 1 Aug 2023 20:40:54 -0400 Steven Rostedt [off-list ref] wrote:
> > > Maybe we can add a ftrace_partial_regs(fregs) that returns a
> > > partially filled pt_regs, and the caller that uses this obviously
> > > knows its partial (as it's in the name). But this doesn't quite help
> > > out arm64 because unlike x86, struct ftrace_regs does not contain an
> > > address compatibility with pt_regs fields. It would need to do a copy.
> > >
> > >   ftrace_partial_regs(fregs, &regs) ?
> >
> > Well, both would be pointers so you wouldn't need the "&", but it was
> > to stress that it would be copying one to the other.
> >
> >   void ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs regs);
>
> Copy works, but why did you pick a different layout?
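
To make the copy idea above concrete, here is a minimal userspace sketch. It is not the kernel implementation: both structs are hypothetical, simplified stand-ins (the real arm64 layouts differ), and the destination is taken as a pointer, which is an assumption based on the "both would be pointers" remark.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures.
 * Field names and sizes are for illustration only. */
struct pt_regs {
	unsigned long regs[31];
	unsigned long sp;
	unsigned long pc;
	unsigned long pstate;
};

struct ftrace_regs {
	unsigned long regs[9];	/* x0-x8 in this model */
	unsigned long fp;
	unsigned long lr;
	unsigned long sp;
	unsigned long pc;
};

/* Copy only what ftrace saved into the matching pt_regs slots.
 * Fields that ftrace never saved (e.g. pstate) are left untouched. */
void ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
{
	assert(fregs && regs);

	memcpy(regs->regs, fregs->regs, sizeof(fregs->regs));
	regs->regs[29] = fregs->fp;	/* x29 is the frame pointer */
	regs->regs[30] = fregs->lr;	/* x30 is the link register */
	regs->sp = fregs->sp;
	regs->pc = fregs->pc;
}
```

A caller would hand in its own pt_regs storage and must treat every field not written here as undefined unless it zeroed the structure first.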
I think it is to minimize stack consumption. pt_regs on arm64 consumes 42 * u64 = 336 bytes, whereas ftrace_regs uses 14 * unsigned long = 112 bytes. And most of the registers in pt_regs are usually not accessed. (As you may know, RISC processors usually have many registers, and x86 will too if we use APX in the kernel. So pt_regs is big.)
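
The arithmetic above can be checked with a small sketch. The two structs are hypothetical models that only reproduce the slot counts quoted in this mail (42 and 14), with each slot assumed to be 8 bytes as on arm64; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slot counts taken from the figures in this mail; layouts are models only. */
enum { PT_REGS_SLOTS = 42, FTRACE_REGS_SLOTS = 14 };

struct pt_regs_model {
	uint64_t slot[PT_REGS_SLOTS];
};

struct ftrace_regs_model {
	uint64_t slot[FTRACE_REGS_SLOTS];
};

/* Compile-time checks of the sizes stated above. */
static_assert(sizeof(struct pt_regs_model) == 336, "42 * 8 bytes");
static_assert(sizeof(struct ftrace_regs_model) == 112, "14 * 8 bytes");

/* Stack bytes saved per traced call by using the smaller structure. */
size_t stack_bytes_saved(void)
{
	return sizeof(struct pt_regs_model) - sizeof(struct ftrace_regs_model);
}
```

Under these assumptions the smaller structure saves 336 - 112 = 224 bytes of stack per call.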
> Why not use pt_regs? If saving flags is slow, just skip that part and
> whatever else is slow. You don't even need to zero out unsaved fields.
> Just ask the caller to zero out pt_regs beforehand. Most users have a
> per-cpu pt_regs that is being reused. So there will be one zero-out in
> the beginning and every partial save of regs will be fast. Then there
> won't be any need for a copy-converter from ftrace_regs to pt_regs.
> Maybe too much churn at this point. Copy is fine.
If there is no nested call, yeah, per-cpu pt_regs will work.

Thank you,

--
Masami Hiramatsu (Google) [off-list ref]
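
For illustration, a userspace sketch of the reuse scheme and of the nesting caveat. Everything here is an assumption for the sake of the example: a single static buffer stands in for a per-cpu pt_regs, and the struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical, simplified register snapshot. */
struct pt_regs_model {
	unsigned long regs[31];
	unsigned long sp;
	unsigned long pc;
};

/* Stand-in for a per-cpu pt_regs. Static storage is zeroed once at
 * startup, which models the single zero-out "in the beginning". */
static struct pt_regs_model scratch;

/* Partial save: write only the fields that are cheap to capture and
 * leave the rest at their zeroed values. */
struct pt_regs_model *partial_save(unsigned long arg0, unsigned long pc)
{
	scratch.regs[0] = arg0;
	scratch.pc = pc;
	return &scratch;
}

/* Returns 1 if a nested save overwrote the outer caller's snapshot. */
int nested_save_clobbers_outer(void)
{
	struct pt_regs_model *outer = partial_save(1, 0x1000);
	struct pt_regs_model *inner = partial_save(2, 0x2000); /* nested event */

	assert(outer->sp == 0);	/* never saved, still zero */
	return outer == inner && outer->regs[0] == 2;
}
```

In this model the second save lands in the same buffer, so the outer caller no longer sees its own values. That is the case where a shared per-cpu buffer alone is not enough.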