Re: [PATCH 4/9] powerpc/bpf: Handle large branch ranges with BPF_EXIT
From: Naveen N. Rao <hidden>
Date: 2022-01-07 11:46:35
Also in:
bpf
Christophe Leroy wrote:
> On 04/10/2021 at 20:24, Naveen N. Rao wrote:
>> Christophe Leroy wrote:
>>> On 01/10/2021 at 23:14, Naveen N. Rao wrote:
>>>> In some scenarios, it is possible that the program epilogue is outside
>>>> the branch range for a BPF_EXIT instruction. Instead of rejecting such
>>>> programs, emit an indirect branch. We track the size of the bpf program
>>>> emitted after the initial run and do a second pass since BPF_EXIT can
>>>> end up emitting different number of instructions depending on the
>>>> program size.
>>>>
>>>> Suggested-by: Jordan Niethe <redacted>
>>>> Signed-off-by: Naveen N. Rao <redacted>
>>>> ---
>>>>  arch/powerpc/net/bpf_jit.h        |  3 +++
>>>>  arch/powerpc/net/bpf_jit_comp.c   | 22 +++++++++++++++++++++-
>>>>  arch/powerpc/net/bpf_jit_comp32.c |  2 +-
>>>>  arch/powerpc/net/bpf_jit_comp64.c |  2 +-
>>>>  4 files changed, 26 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
>>>> index 89bd744c2bffd4..4023de1698b9f5 100644
>>>> --- a/arch/powerpc/net/bpf_jit.h
>>>> +++ b/arch/powerpc/net/bpf_jit.h
>>>> @@ -126,6 +126,7 @@
>>>>  #define SEEN_FUNC	0x20000000 /* might call external helpers */
>>>>  #define SEEN_TAILCALL	0x40000000 /* uses tail calls */
>>>> +#define SEEN_BIG_PROG	0x80000000 /* large prog, >32MB */
>>>>  #define SEEN_VREG_MASK	0x1ff80000 /* Volatile registers r3-r12 */
>>>>  #define SEEN_NVREG_MASK	0x0003ffff /* Non volatile registers r14-r31 */
>>>> @@ -179,6 +180,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>>>  void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx);
>>>>  void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx);
>>>>  void bpf_jit_realloc_regs(struct codegen_context *ctx);
>>>> +int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx,
>>>> +			   int tmp_reg, unsigned long exit_addr);
>>>>  #endif
>>>> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
>>>> index fcbf7a917c566e..3204872fbf2738 100644
>>>> --- a/arch/powerpc/net/bpf_jit_comp.c
>>>> +++ b/arch/powerpc/net/bpf_jit_comp.c
>>>> @@ -72,6 +72,21 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
>>>>  	return 0;
>>>>  }
>>>> +int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx,
>>>> +			   int tmp_reg, unsigned long exit_addr)
>>>> +{
>>>> +	if (!(ctx->seen & SEEN_BIG_PROG) && is_offset_in_branch_range(exit_addr)) {
>>>> +		PPC_JMP(exit_addr);
>>>> +	} else {
>>>> +		ctx->seen |= SEEN_BIG_PROG;
>>>> +		PPC_FUNC_ADDR(tmp_reg, (unsigned long)image + exit_addr);
>>>> +		EMIT(PPC_RAW_MTCTR(tmp_reg));
>>>> +		EMIT(PPC_RAW_BCTR());
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  struct powerpc64_jit_data {
>>>>  	struct bpf_binary_header *header;
>>>>  	u32 *addrs;
>>>> @@ -155,12 +170,17 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>>>>  		goto out_addrs;
>>>>  	}
>>>> +	if (!is_offset_in_branch_range((long)cgctx.idx * 4))
>>>> +		cgctx.seen |= SEEN_BIG_PROG;
>>>> +
>>>>  	/*
>>>>  	 * If we have seen a tail call, we need a second pass.
>>>>  	 * This is because bpf_jit_emit_common_epilogue() is called
>>>>  	 * from bpf_jit_emit_tail_call() with a not yet stable ctx->seen.
>>>> +	 * We also need a second pass if we ended up with too large
>>>> +	 * a program so as to fix branches.
>>>>  	 */
>>>> -	if (cgctx.seen & SEEN_TAILCALL) {
>>>> +	if (cgctx.seen & (SEEN_TAILCALL | SEEN_BIG_PROG)) {
>>>>  		cgctx.idx = 0;
>>>>  		if (bpf_jit_build_body(fp, 0, &cgctx, addrs, false)) {
>>>>  			fp = org_fp;
>>>> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
>>>> index a74d52204f8da2..d2a67574a23066 100644
>>>> --- a/arch/powerpc/net/bpf_jit_comp32.c
>>>> +++ b/arch/powerpc/net/bpf_jit_comp32.c
>>>> @@ -852,7 +852,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>>>  			 * we'll just fall through to the epilogue.
>>>>  			 */
>>>>  			if (i != flen - 1)
>>>> -				PPC_JMP(exit_addr);
>>>> +				bpf_jit_emit_exit_insn(image, ctx, tmp_reg, exit_addr);
>>>
>>> On ppc32, if you use tmp_reg you must flag it. But I think you could use
>>> r0 instead.
>>
>> Indeed. Can we drop tracking of the temp registers and using them while
>> remapping registers? Are you seeing significant benefits with re-use of
>> those temp registers?
>
> I'm not sure to follow you.
>
> On ppc32, all volatile registers are used for function arguments, so temp
> registers are necessarily non-volatile so we track them as all
> non-volatile registers we use.
>
> I think saving on stack only the non-volatile registers we use provides
> real benefit, otherwise you wouldn't have implemented it would you?

You're right. I was wary of having to track temporary register usage, which is a bit harder and prone to mistakes like the above. A related concern was that the register remapping is only used if there are no helper calls, which looks like a big limitation. But, I do agree that it is worth the trouble for ppc32 given the register usage.

> Anyway here you should use _R0 instead of tmp_reg.

Thanks,
Naveen
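
The exit-branch decision and the second sizing pass described in the quoted patch can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions, not the JIT code itself: every `sketch_*` name and `SKETCH_SEEN_BIG_PROG` is hypothetical, the range check takes a branch displacement directly, and the number of instructions needed to load the target address is passed in as a parameter (`load_addr_insns`) because it differs between ppc32 and ppc64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag mirroring SEEN_BIG_PROG from the quoted patch. */
#define SKETCH_SEEN_BIG_PROG 0x80000000u

/* Toy codegen context: only tracks flags and the instruction count. */
struct sketch_ctx {
	uint32_t seen;
	size_t idx;	/* number of 32-bit instructions emitted so far */
};

/* A relative `b` takes a signed 26-bit, word-aligned displacement. */
static bool sketch_offset_in_branch_range(long offset)
{
	return offset >= -0x2000000L && offset <= 0x1fffffcL && !(offset & 0x3);
}

/*
 * Account for one exit sequence. Returns the number of instruction
 * slots used: 1 for a direct branch, otherwise the (assumed) address
 * load followed by mtctr and bctr.
 */
static size_t sketch_emit_exit(struct sketch_ctx *ctx, long exit_offset,
			       size_t load_addr_insns)
{
	size_t used;

	assert(ctx != NULL);
	if (!(ctx->seen & SKETCH_SEEN_BIG_PROG) &&
	    sketch_offset_in_branch_range(exit_offset)) {
		used = 1;
	} else {
		/* Once set, the flag is sticky so later passes agree on sizes. */
		ctx->seen |= SKETCH_SEEN_BIG_PROG;
		used = load_addr_insns + 2;
	}
	ctx->idx += used;
	return used;
}

/*
 * Sizing: run once assuming short exits, then redo the count if the
 * program turned out to be larger than the branch range.
 */
static size_t sketch_size_program(struct sketch_ctx *ctx, size_t body_insns,
				  size_t n_exits, size_t load_addr_insns)
{
	for (int pass = 0; pass < 2; pass++) {
		ctx->idx = body_insns;
		for (size_t i = 0; i < n_exits; i++)
			sketch_emit_exit(ctx, 0, load_addr_insns);
		if (pass == 0) {
			if (!sketch_offset_in_branch_range((long)ctx->idx * 4))
				ctx->seen |= SKETCH_SEEN_BIG_PROG;
			if (!(ctx->seen & SKETCH_SEEN_BIG_PROG))
				break;	/* sizes are already stable */
		}
	}
	return ctx->idx;
}
```

The sticky flag is the point of the second pass: once any exit needs the long form, every exit is emitted in the long form, so instruction counts stop changing between passes. The sketch does not model which register holds the target address, which is the subject of the tmp_reg versus _R0 remark above.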