Thread (5 messages) 5 messages, 2 authors, 2025-09-11

Re: [PATCH] powerpc64/modules: fix ool-ftrace-stub vs. livepatch relocation corruption

From: Naveen N Rao <naveen@kernel.org>
Date: 2025-09-08 11:06:37
Also in: live-patching

On Wed, Sep 03, 2025 at 10:37:39PM -0400, Joe Lawrence wrote:
On Wed, Sep 03, 2025 at 10:29:50PM -0400, Joe Lawrence wrote:
quoted
The powerpc64 module .stubs section holds ppc64_stub_entry[] code
trampolines that are generated at module load time. These stubs are
necessary for function calls to external symbols that are too far away
for a simple relative branch.

Logic for finding an available ppc64_stub_entry has relied on a sentinel
value in the funcdata member to indicate a used slot. Code iterates
through the array, inspecting .funcdata to find the first unused (zeroed)
entry:

  for (i = 0; stub_func_addr(stubs[i].funcdata); i++)

To support CONFIG_PPC_FTRACE_OUT_OF_LINE, a new setup_ftrace_ool_stubs()
function extended the .stubs section by adding an array of
ftrace_ool_stub structures for each patchable function. A side effect
of writing these smaller structures is that the funcdata sentinel
convention is not maintained.
There is clearly a bug in how we are reserving the stubs as you point 
out further below, but once that is properly initialized, I don't think 
the smaller structure size for ftrace_ool_stub matters (in so far as 
stub->funcdata being non-NULL). We end up writing four valid powerpc 
instructions, along with a ftrace_ops pointer before those instructions 
which should also be non-zero (there is a bug here too, more on that 
below).  The whole function descriptor dance does complicate matters a 
bit though.
quoted
This is not a problem for an ordinary
kernel module, as this occurs in module_finalize(), after which no
further .stubs updates are needed.

However, when loading a livepatch module that contains klp-relocations,
apply_relocate_add() is executed a second time, after the out-of-line
ftrace stubs have been set up.

When apply_relocate_add() then calls stub_for_addr() to handle a
klp-relocation, its search for the next available ppc64_stub_entry[]
slot may stop prematurely in the middle of the ftrace_ool_stub array. A
full ppc64_stub_entry is then written, corrupting the ftrace stubs.

Fix this by explicitly tracking the count of used ppc64_stub_entrys.
Rather than relying on an inline funcdata sentinel value, a new
stub_count is used as the explicit boundary for searching and allocating
stubs. This simplifies the code, eliminates the "one extra reloc" that
was required for the sentinel check, and resolves the memory corruption.
Apologies if this is too wordy, I wrote it as a bit of a braindump to
summarize the longer analysis at the bottom of the reply ...
This was a good read, thanks for all the details. It did help spot 
another issue.
quoted
Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
---
 arch/powerpc/include/asm/module.h |  1 +
 arch/powerpc/kernel/module_64.c   | 26 ++++++++------------------
 2 files changed, 9 insertions(+), 18 deletions(-)
diff --git a/arch/powerpc/include/asm/module.h b/arch/powerpc/include/asm/module.h
index e1ee5026ac4a..864e22deaa2c 100644
--- a/arch/powerpc/include/asm/module.h
+++ b/arch/powerpc/include/asm/module.h
@@ -27,6 +27,7 @@ struct ppc_plt_entry {
 struct mod_arch_specific {
 #ifdef __powerpc64__
 	unsigned int stubs_section;	/* Index of stubs section in module */
+	unsigned int stub_count;	/* Number of stubs used */
 #ifdef CONFIG_PPC_KERNEL_PCREL
 	unsigned int got_section;	/* What section is the GOT? */
 	unsigned int pcpu_section;	/* .data..percpu section */
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 126bf3b06ab7..2a44bc8e2439 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -209,8 +209,7 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr,
 				    char *secstrings,
 				    struct module *me)
 {
-	/* One extra reloc so it's always 0-addr terminated */
-	unsigned long relocs = 1;
+	unsigned long relocs = 0;
 	unsigned i;
 
 	/* Every relocated section... */
@@ -705,7 +704,7 @@ static unsigned long stub_for_addr(const Elf64_Shdr *sechdrs,
 
 	/* Find this stub, or if that fails, the next avail. entry */
 	stubs = (void *)sechdrs[me->arch.stubs_section].sh_addr;
-	for (i = 0; stub_func_addr(stubs[i].funcdata); i++) {
+	for (i = 0; i < me->arch.stub_count; i++) {
 		if (WARN_ON(i >= num_stubs))
 			return 0;
 
@@ -716,6 +715,7 @@ static unsigned long stub_for_addr(const Elf64_Shdr *sechdrs,
 	if (!create_stub(sechdrs, &stubs[i], addr, me, name))
 		return 0;
 
+	me->arch.stub_count++;
 	return (unsigned long)&stubs[i];
 }
 
@@ -1118,29 +1118,19 @@ int module_trampoline_target(struct module *mod, unsigned long addr,
 static int setup_ftrace_ool_stubs(const Elf64_Shdr *sechdrs, unsigned long addr, struct module *me)
 {
 #ifdef CONFIG_PPC_FTRACE_OUT_OF_LINE
-	unsigned int i, total_stubs, num_stubs;
+	unsigned int total_stubs, num_stubs;
 	struct ppc64_stub_entry *stub;
 
 	total_stubs = sechdrs[me->arch.stubs_section].sh_size / sizeof(*stub);
 	num_stubs = roundup(me->arch.ool_stub_count * sizeof(struct ftrace_ool_stub),
 			    sizeof(struct ppc64_stub_entry)) / sizeof(struct ppc64_stub_entry);
 
-	/* Find the next available entry */
-	stub = (void *)sechdrs[me->arch.stubs_section].sh_addr;
-	for (i = 0; stub_func_addr(stub[i].funcdata); i++)
-		if (WARN_ON(i >= total_stubs))
-			return -1;
-
-	if (WARN_ON(i + num_stubs > total_stubs))
+	if (WARN_ON(me->arch.stub_count + num_stubs > total_stubs))
 		return -1;
 
-	stub += i;
-	me->arch.ool_stubs = (struct ftrace_ool_stub *)stub;
-
-	/* reserve stubs */
-	for (i = 0; i < num_stubs; i++)
-		if (patch_u32((void *)&stub->funcdata, PPC_RAW_NOP()))
-			return -1;
At first I thought the bug was that this loop was re-writting the same
PPC_RAW_NOP() to the same funcdata (i.e, should have been something
like: patch_u32((void *)stub[i]->funcdata, PPC_RAW_NOP())), but that
didn't work and seemed fragile anyway.
D'uh - this path was clearly never tested. I suppose this should have 
been something like this:
	patch_ulong((void *)&stub[i]->funcdata, func_desc(local_paca))

Still convoluted, but I think that should hopefully fix the problem you 
are seeing.
quoted
+	stub = (void *)sechdrs[me->arch.stubs_section].sh_addr;
+	me->arch.ool_stubs = (struct ftrace_ool_stub *)(stub + me->arch.stub_count);
+	me->arch.stub_count += num_stubs;
 #endif
Regardless of the above, your proposed change looks good to me and 
simplifies the logic. So:
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
  crash> dis 0xc008000007d70dd0 42
  ppc64[ ]   ftrace[0]    <xfs_stats_format+0x558>:    .long 0x0
                          <xfs_stats_format+0x55c>:    .long 0x0
                          <xfs_stats_format+0x560>:    mflr    r0
                          <xfs_stats_format+0x564>:    bl      0xc008000007d70d80 <xfs_stats_format+0x508>
                          <xfs_stats_format+0x568>:    mtlr    r0
                          <xfs_stats_format+0x56c>:    b       0xc008000007d70014 <patch_free_livepatch+0xc>
             ftrace[1]    <xfs_stats_format+0x570>:    .long 0x0
                          <xfs_stats_format+0x574>:    .long 0x0
                          <xfs_stats_format+0x578>:    mflr    r0
                          <xfs_stats_format+0x57c>:    bl      0xc008000007d70d80 <xfs_stats_format+0x508>
  ppc64[ ]                <xfs_stats_format+0x580>:    addis   r11,r2,4                                         << This looks like a full
                          <xfs_stats_format+0x584>:    addi    r11,r11,-29448                                   << ppc64_stub_entry
             ftrace[2]    <xfs_stats_format+0x588>:    std     r2,24(r1)                                        << dropped in the middle
                          <xfs_stats_format+0x58c>:    ld      r12,32(r11)                                      << of the ool_stubs array
                          <xfs_stats_format+0x590>:    mtctr   r12                                              << of ftrace_ool_stub[]
                          <xfs_stats_format+0x594>:    bctr                                                     <<
                          <xfs_stats_format+0x598>:    mtlr    r0                                               <<
                          <xfs_stats_format+0x59c>:    andi.   r20,r27,30050                                    <<
             ftrace[3]    <xfs_stats_format+0x5a0>:    .long 0x54e92b8                                          <<
                          <xfs_stats_format+0x5a4>:    lfs     f0,0(r8)                                         <<
  ppc64[ ]                <xfs_stats_format+0x5a8>:    mflr    r0
                          <xfs_stats_format+0x5ac>:    bl      0xc008000007d70d80 <xfs_stats_format+0x508>
                          <xfs_stats_format+0x5b0>:    mtlr    r0
                          <xfs_stats_format+0x5b4>:    b       0xc008000007d7037c <add_callbacks_to_patch_objects+0xc>
             ftrace[4]    <xfs_stats_format+0x5b8>:    .long 0x0
                          <xfs_stats_format+0x5bc>:    .long 0x0
These NULL values are also problematic. I think those are NULL since we 
are not "reserving" the stubs properly, but those should point to some 
ftrace_op. I think we are missing a call to ftrace_rec_set_nop_ops() in 
ftrace_init_nop(), which would be good to do separately.


- Naveen

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help