Re: [PATCH v2 3/6] powerpc/module: Optimise nearby branches in ELF V2 ABI stub
From: Christophe Leroy <hidden>
Date: 2022-09-26 14:50:28
On 26/09/2022 at 08:43, Benjamin Gray wrote:
Inserts a direct branch to the stub target when possible, replacing the
mtctr/btctr sequence.

The load into r12 could potentially be skipped too, but that change
would need to refactor the arguments to indicate that the address
does not have a separate local entry point.

This helps the static call implementation, where modules calling their
own trampolines are called through this stub and the trampoline is
easily within range of a direct branch.

Signed-off-by: Benjamin Gray <redacted>
---
 arch/powerpc/kernel/module_64.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 4d816f7785b4..745ce9097dcf 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -141,6 +141,12 @@ static u32 ppc64_stub_insns[] = {
 	PPC_RAW_BCTR(),
 };
 
+#ifdef CONFIG_PPC64_ELF_ABI_V1
+#define PPC64_STUB_MTCTR_OFFSET 5
+#else
+#define PPC64_STUB_MTCTR_OFFSET 4
+#endif
+
 /* Count how many different 24-bit relocations (different symbol,
    different addend) */
 static unsigned int count_relocs(const Elf64_Rela *rela, unsigned int num)
@@ -429,6 +435,8 @@ static inline int create_stub(const Elf64_Shdr *sechdrs,
 	long reladdr;
 	func_desc_t desc;
 	int i;
+	u32 *jump_seq_addr = &entry->jump[PPC64_STUB_MTCTR_OFFSET];
+	ppc_inst_t direct;
 
 	if (is_mprofile_ftrace_call(name))
 		return create_ftrace_stub(entry, addr, me);
@@ -439,6 +447,11 @@ static inline int create_stub(const Elf64_Shdr *sechdrs,
 		return 0;
 	}
 
+	/* Replace indirect branch sequence with direct branch where possible */
+	if (!create_branch(&direct, jump_seq_addr, addr, 0))
+		if (patch_instruction(jump_seq_addr, direct))
Why not use patch_branch()?
+			return 0;
+
 	/* Stub uses address relative to r2. */
 	reladdr = (unsigned long)entry - my_r2(sechdrs, me);
 	if (reladdr > 0x7FFFFFFF || reladdr < -(0x80000000L)) {
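
To illustrate the question above: as far as I can tell, patch_branch() is just create_branch() followed by patch_instruction(), returning -ERANGE when the target cannot be reached by a relative branch. Below is a minimal user-space sketch of that shape, not kernel code. Every model_* name is hypothetical, the "patch" is a plain store rather than the kernel's real text patching, and only relative branches are modelled (no absolute flag).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* I-form "b" instruction: primary opcode 18 in the top six bits. */
#define MODEL_BRANCH_OPCODE 0x48000000u

/* A relative branch reaches -32 MiB .. +32 MiB - 4 and must be word aligned. */
static int model_offset_in_branch_range(long offset)
{
	return offset >= -0x2000000L && offset <= 0x1fffffcL && !(offset & 0x3);
}

/*
 * Encode a relative branch from addr to target.
 * Returns 0 on success, 1 if the target is out of range
 * (same convention as the create_branch() call in the patch).
 */
static int model_create_branch(uint32_t *instr, unsigned long addr,
			       unsigned long target, int link)
{
	long offset = (long)(target - addr);

	*instr = 0;
	if (!model_offset_in_branch_range(offset))
		return 1;

	/* LK is bit 0; the 24-bit LI field sits in bits 2..25. */
	*instr = MODEL_BRANCH_OPCODE | (link ? 1u : 0u) |
		 ((uint32_t)offset & 0x03fffffcu);
	return 0;
}

/* Stand-in for patch_instruction(): a plain store, nothing more. */
static int model_patch_instruction(uint32_t *slot, uint32_t instr)
{
	assert(slot);
	*slot = instr;
	return 0;
}

/* Combined helper with the same shape as patch_branch(). */
static int model_patch_branch(uint32_t *slot, unsigned long addr,
			      unsigned long target, int link)
{
	uint32_t instr;

	if (model_create_branch(&instr, addr, target, link))
		return -ERANGE;	/* out of range: slot is left untouched */
	return model_patch_instruction(slot, instr);
}
```

With a helper of this shape, the stub code would need a single call on jump_seq_addr, and an out-of-range target would leave the existing mtctr/bctr sequence in place. One thing to watch if switching: the caller can only tell "out of range" apart from "patching failed" by checking for -ERANGE explicitly.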