Re: [PATCH bpf-next v9 07/11] bpf,x86: add fsession support for x86_64
From: Andrii Nakryiko <hidden>
Date: 2026-01-14 19:06:10
Also in:
bpf, lkml
On Tue, Jan 13, 2026 at 7:27 PM Menglong Dong [off-list ref] wrote:
> On 2026/1/14 09:25 Andrii Nakryiko [off-list ref] wrote:
>> On Sat, Jan 10, 2026 at 6:12 AM Menglong Dong [off-list ref] wrote:
>>> Add BPF_TRACE_FSESSION supporting to x86_64, including:
>>> [...]
>>>
>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>> index d94f7038c441..0671a434c00d 100644
>>> --- a/arch/x86/net/bpf_jit_comp.c
>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>> @@ -3094,12 +3094,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
>>>  static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
>>>  		      struct bpf_tramp_links *tl, int stack_size,
>>>  		      int run_ctx_off, bool save_ret,
>>> -		      void *image, void *rw_image)
>>> +		      void *image, void *rw_image, u64 func_meta)
>>>  {
>>>  	int i;
>>>  	u8 *prog = *pprog;
>>>
>>>  	for (i = 0; i < tl->nr_links; i++) {
>>> +		if (tl->links[i]->link.prog->call_session_cookie) {
>>> +			/* 'stack_size + 8' is the offset of func_md in stack */
>>
>> not func_md, don't invent new names, "func_meta" (but it's also so
>
> Ah, it should be func_meta here, it's a typo.
>
>> backwards that you have stack offsets as positive... and it's not even
>> in verifier's stack slots, just bytes... very confusing to me)
>
> Do you mean the offset to emit_store_stack_imm64()? I'll convert it to
> negative after modifying emit_store_stack_imm64() as you suggested.

yes

>>> +			emit_store_stack_imm64(&prog, stack_size + 8, func_meta);
>>> +			func_meta -= (1 << BPF_TRAMP_M_COOKIE);
>>
>> was this supposed to be BPF_TRAMP_M_IS_RETURN?... and why didn't AI
>> catch this?
>
> It should be BPF_TRAMP_M_COOKIE here. I'm decrementing it to compute the
> offset of the session cookie for the next bpf program. This part
> corresponds to the 5th patch. It will be clearer if you combine it
> with the 5th patch. Seems that it's a little confusing here :/

It is confusing. And invoke_bpf is partly provided with opaque func_meta, but also partly knows its structure and does extra adjustments, I don't like it. I think it would be simpler to just pass nr_args and cookies_offset and let invoke_bpf construct func_meta for each program invocation, IMO.
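
To make the suggestion concrete, here is a rough userspace sketch of building the metadata word from its parts for each invocation, instead of adjusting an opaque value in place. The bit positions and the sketch_* names are hypothetical and chosen only for illustration; the thread confirms only that nr_args sits in the least significant byte. This is not the code from the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical bit positions, for illustration only. The thread confirms
 * only that nr_args occupies the least significant byte (shift 0).
 */
#define SKETCH_IS_RETURN_SHIFT	8
#define SKETCH_COOKIE_SHIFT	9

/* Build the metadata word for a single program invocation. */
static uint64_t sketch_make_func_meta(unsigned int nr_args,
				      unsigned int cookie_idx,
				      int is_return)
{
	uint64_t meta;

	assert(nr_args <= 0xff);	/* must fit in the low byte */

	meta = nr_args;
	meta |= (uint64_t)cookie_idx << SKETCH_COOKIE_SHIFT;
	if (is_return)
		meta |= 1ULL << SKETCH_IS_RETURN_SHIFT;
	return meta;
}
```

With a helper of this shape, the caller would pass the cookie index for the current program and a smaller index for the next program that uses a cookie, so no "func_meta -= ..." adjustment is needed inside the loop.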

> Maybe some comment is needed here.

>>> +		}
>>>  		if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
>>>  				    run_ctx_off, save_ret, image, rw_image))
>>>  			return -EINVAL;
>>> @@ -3222,7 +3227,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
>>>  	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>>>  	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
>>>  	void *orig_call = func_addr;
>>> +	int cookie_off, cookie_cnt;
>>>  	u8 **branches = NULL;
>>> +	u64 func_meta;
>>>  	u8 *prog;
>>>  	bool save_ret;
>>> @@ -3290,6 +3297,11 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
>>>  	ip_off = stack_size;
>>> +	cookie_cnt = bpf_fsession_cookie_cnt(tlinks);
>>> +	/* room for session cookies */
>>> +	stack_size += cookie_cnt * 8;
>>> +	cookie_off = stack_size;
>>> +
>>>  	stack_size += 8;
>>>  	rbx_off = stack_size;
>>> @@ -3383,9 +3395,19 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
>>>  		}
>>>  	}
>>> +	if (bpf_fsession_cnt(tlinks)) {
>>> +		/* clear all the session cookies' value */
>>> +		for (int i = 0; i < cookie_cnt; i++)
>>> +			emit_store_stack_imm64(&prog, cookie_off - 8 * i, 0);
>>> +		/* clear the return value to make sure fentry always get 0 */
>>> +		emit_store_stack_imm64(&prog, 8, 0);
>>> +	}
>>> +	func_meta = nr_regs + (((cookie_off - regs_off) / 8) << BPF_TRAMP_M_COOKIE);
>>
>> func_meta conceptually is a collection of bit fields, so using +/-
>> feels weird, use | and &, more in line with working with bits?
>
> It's not only for bit fields. For nr_args and cookie offset, they are
> byte fields. Especially for cookie offset, arithmetic operation is
> performed too. So I think it makes sense here, right?
>
>> (also you defined that BPF_TRAMP_M_NR_ARGS but you are not using it
>> consistently...)
>
> I'm not sure if we should define it. As we use the least significant
> byte for the nr_args, the shift for it is always 0. If we use it in the
> inline, an unnecessary instruction will be generated, which is the bit
> shift instruction. I defined it here for better code reading. Maybe we
> can add a comment in the inline of bpf_get_func_arg(), instead of
> defining such an unused macro?

I think I just wouldn't define NR_ARGS macro at all then, given inline implementation implicitly encodes that knowledge anyways.
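
For reference, a small userspace sketch of the mask-based style suggested above, where each field is read and written through | and & rather than + and -. The field layout, the assumed 8-bit width of the cookie field, and the FM_*/fm_* names are all hypothetical and exist only for this illustration; they are not taken from the patch.

```c
#include <stdint.h>

/* Assumed layout, illustration only: nr_args in bits 0-7, the is_return
 * flag in bit 8, and an 8-bit cookie index starting at bit 9. */
#define FM_NR_ARGS_MASK		0xffULL
#define FM_IS_RETURN_BIT	(1ULL << 8)
#define FM_COOKIE_SHIFT		9
#define FM_COOKIE_MASK		(0xffULL << FM_COOKIE_SHIFT)

/* Extract the argument count from the low byte. */
static inline unsigned int fm_nr_args(uint64_t meta)
{
	return (unsigned int)(meta & FM_NR_ARGS_MASK);
}

/* Test whether the is_return flag is set. */
static inline int fm_is_return(uint64_t meta)
{
	return (meta & FM_IS_RETURN_BIT) != 0;
}

/* Extract the cookie index field. */
static inline unsigned int fm_cookie(uint64_t meta)
{
	return (unsigned int)((meta & FM_COOKIE_MASK) >> FM_COOKIE_SHIFT);
}

/* Replace the cookie index without touching the other fields. */
static inline uint64_t fm_set_cookie(uint64_t meta, unsigned int idx)
{
	return (meta & ~FM_COOKIE_MASK) |
	       (((uint64_t)idx << FM_COOKIE_SHIFT) & FM_COOKIE_MASK);
}

/* Set the is_return flag; setting it twice leaves the word unchanged. */
static inline uint64_t fm_set_return(uint64_t meta)
{
	return meta | FM_IS_RETURN_BIT;
}
```

Under this assumed layout, fm_set_cookie(meta, idx - 1) produces the same word as subtracting (1 << FM_COOKIE_SHIFT) when idx is nonzero. The difference shows up on misuse: OR-ing the flag is idempotent, while adding it twice would carry into the neighbouring cookie field.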

> Thanks!
> Menglong Dong

>>> +
>>>  	if (fentry->nr_links) {
>>>  		if (invoke_bpf(m, &prog, fentry, regs_off, run_ctx_off,
>>> -			       flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image))
>>> +			       flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image,
>>> +			       func_meta))
>>>  			return -EINVAL;
>>>  	}
>>> @@ -3445,9 +3467,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
>>>  		}
>>>  	}
>>> +	/* set the "is_return" flag for fsession */
>>> +	func_meta += (1 << BPF_TRAMP_M_IS_RETURN);
>>> +	if (bpf_fsession_cnt(tlinks))
>>> +		emit_store_stack_imm64(&prog, nregs_off, func_meta);
>>> +
>>>  	if (fexit->nr_links) {
>>>  		if (invoke_bpf(m, &prog, fexit, regs_off, run_ctx_off,
>>> -			       false, image, rw_image)) {
>>> +			       false, image, rw_image, func_meta)) {
>>>  			ret = -EINVAL;
>>>  			goto cleanup;
>>>  		}
>>> --
>>> 2.52.0