Re: [RFC PATCH v3 11/11] powerpc64/bpf: Add support for bpf trampolines
From: Naveen N Rao <naveen@kernel.org>
Date: 2024-07-01 20:00:16
Also in:
bpf, linuxppc-dev
On Mon, Jul 01, 2024 at 09:03:52PM GMT, Nicholas Piggin wrote:
On Fri Jun 21, 2024 at 5:09 AM AEST, Naveen N Rao wrote:quoted
Add support for bpf_arch_text_poke() and arch_prepare_bpf_trampoline() for 64-bit powerpc.What do BPF trampolines give you?
At a very basic level, they provide a way to attach bpf programs at
function entry/exit - as an alternative to ftrace/kprobe - in a more
optimal manner. Commit fec56f5890d9 ("bpf: Introduce BPF trampoline")
has more details.
quoted
BPF prog JIT is extended to mimic 64-bit powerpc approach for ftrace having a single nop at function entry, followed by the function profiling sequence out-of-line and a separate long branch stub for calls to trampolines that are out of range. A dummy_tramp is provided to simplify synchronization similar to arm64.Synrhonization - between BPF and ftrace interfaces?quoted
BPF Trampolines adhere to the existing ftrace ABI utilizing a two-instruction profiling sequence, as well as the newer ABI utilizing a three-instruction profiling sequence enabling return with a 'blr'. The trampoline code itself closely follows x86 implementation. While the code is generic, BPF trampolines are only enabled on 64-bit powerpc. 32-bit powerpc will need testing and some updates. Signed-off-by: Naveen N Rao <naveen@kernel.org>Just a quick glance for now, and I don't know BPF code much.quoted
--- arch/powerpc/include/asm/ppc-opcode.h | 14 + arch/powerpc/net/bpf_jit.h | 11 + arch/powerpc/net/bpf_jit_comp.c | 702 +++++++++++++++++++++++++- arch/powerpc/net/bpf_jit_comp32.c | 7 +- arch/powerpc/net/bpf_jit_comp64.c | 7 +- 5 files changed, 738 insertions(+), 3 deletions(-)diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index 076ae60b4a55..9eaa2c5d9b73 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h@@ -585,12 +585,26 @@ #define PPC_RAW_MTSPR(spr, d) (0x7c0003a6 | ___PPC_RS(d) | __PPC_SPR(spr)) #define PPC_RAW_EIEIO() (0x7c0006ac) +/* bcl 20,31,$+4 */ +#define PPC_RAW_BCL() (0x429f0005)This is the special bcl form that gives the current address. Maybe call it PPC_RAW_BCL4()
Sure.
quoted
+void dummy_tramp(void); + +asm ( +" .pushsection .text, \"ax\", @progbits ;" +" .global dummy_tramp ;" +" .type dummy_tramp, @function ;" +"dummy_tramp: ;" +#ifdef CONFIG_FTRACE_PFE_OUT_OF_LINE +" blr ;" +#else +" mflr 11 ;"Can you just drop this instruction? The caller will always have it in r11?
Indeed. Will add a comment and remove the instruction.
quoted
+" mtctr 11 ;" +" mtlr 0 ;" +" bctr ;" +#endif +" .size dummy_tramp, .-dummy_tramp ;" +" .popsection ;" +); + +void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx) +{ + int ool_stub_idx, long_branch_stub_idx; + + /* + * Out-of-line stub: + * mflr r0 + * [b|bl] tramp + * mtlr r0 // only with CONFIG_FTRACE_PFE_OUT_OF_LINE + * b bpf_func + 4 + */ + ool_stub_idx = ctx->idx; + EMIT(PPC_RAW_MFLR(_R0)); + EMIT(PPC_RAW_NOP()); + if (IS_ENABLED(CONFIG_FTRACE_PFE_OUT_OF_LINE)) + EMIT(PPC_RAW_MTLR(_R0)); + WARN_ON_ONCE(!is_offset_in_branch_range(4 - (long)ctx->idx * 4)); /* TODO */ + EMIT(PPC_RAW_BRANCH(4 - (long)ctx->idx * 4)); + + /* + * Long branch stub: + * .long <dummy_tramp_addr> + * mflr r11 + * bcl 20,31,$+4 + * mflr r12 + * ld r12, -8-SZL(r12) + * mtctr r12 + * mtlr r11 // needed to retain ftrace ABI + * bctr + */You could avoid clobbering LR on >= POWER9 with addpcis instruction. Or use a pcrel load with pcrel even. I guess that's something to do later.
Yes, much of BPF JIT could use a re-look to consider opportunities to emit prefix instructions. Thanks, Naveen