Re: [PATCH v9] livepatch: Clear relocation targets on a module removal
From: Joe Lawrence <joe.lawrence@redhat.com>
Date: 2023-01-24 20:09:20
Also in:
live-patching, lkml
On 1/20/23 01:42, Josh Poimboeuf wrote:
quoted hunk ↗ jump to hunk
On Thu, Jan 19, 2023 at 11:06:35AM -0800, Song Liu wrote:quoted
On Wed, Jan 18, 2023 at 2:08 PM Josh Poimboeuf [off-list ref] wrote:quoted
On Wed, Jan 18, 2023 at 12:47:28PM -0800, Song Liu wrote:quoted
From: Miroslav Benes <mbenes@suse.cz> Josh reported a bug: When the object to be patched is a module, and that module is rmmod'ed and reloaded, it fails to load with: module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8) livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd' The livepatch module has a relocation which references a symbol in the _previous_ loading of nfsd. When apply_relocate_add() tries to replace the old relocation with a new one, it sees that the previous one is nonzero and it errors out. On ppc64le, we have a similar issue: module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd] livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8) livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'Shouldn't there also be a fix for this powerpc issue?There was a working version, but it was not very clean. We couldn't agree on the path forward for powerpc, so we are hoping to ship the fix to x86 (and s390?) first [1].Sorry for coming in late, I was on leave so I missed a lot of the discussions on previous versions. The decision to leave powerpc broken wasn't clear from reading the commit message. The bug is mentioned, and the fix is implied, but surprisingly there's no fix. I agree that the powerpc fix should be in a separate patch, but I still don't feel comfortable merging the x86 fix without the corresponding powerpc fix. powerpc is a major arch and not a second-class citizen. If we don't fix it now then it'll probably never get fixed until it blows up in the real world. For powerpc, instead of clearing, how about just "fixing" the warning site, something like so (untested)?diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index 1096d6b3a62c..1a12463ba674 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c@@ -499,9 +499,11 @@ static unsigned long stub_for_addr(const Elf64_Shdr *sechdrs, /* We expect a noop next: if it is, replace it with instruction to restore r2. */ -static int restore_r2(const char *name, u32 *instruction, struct module *me) +static int restore_r2(const char *name, u32 *instruction, struct module *me, + bool klp_sym) { u32 *prev_insn = instruction - 1; + u32 insn_val = *instruction; if (is_mprofile_ftrace_call(name)) return 1;@@ -514,9 +516,18 @@ static int restore_r2(const char *name, u32 *instruction, struct module *me) if (!instr_is_relative_link_branch(ppc_inst(*prev_insn))) return 1; - if (*instruction != PPC_RAW_NOP()) { + /* + * For a livepatch relocation, the restore r2 instruction might have + * been previously written if the relocation references a symbol in a + * module which was unloaded and is now being reloaded. In that case, + * skip the warning and instruction write. + */ + if (klp_sym && insn_val == PPC_INST_LD_TOC) + return 0;
Hi Josh, Nit: shouldn't this return 1? And if you're willing to entertain a small refactor, wouldn't restore_r2() be clearer if it returned -ESOMETHING on error? Maybe converting to a boolean could work, but then I'd suggest a name that clearly implies success/fail given true/false return. Maybe replace_nop_with_ld_toc() or replace_nop_to_restore_r2() ... still -ESOMETHING is more intuitive to me as there are cases like this where the function safely returns w/o replacing anything.
quoted hunk ↗ jump to hunk
+ + if (insn_val != PPC_RAW_NOP()) { pr_err("%s: Expected nop after call, got %08x at %pS\n", - me->name, *instruction, instruction); + me->name, insn_val, instruction); return 0; }@@ -649,7 +660,8 @@ int apply_relocate_add(Elf64_Shdr *sechdrs, if (!value) return -ENOENT; if (!restore_r2(strtab + sym->st_name, - (u32 *)location + 1, me)) + (u32 *)location + 1, me, + sym->st_shndx == SHN_LIVEPATCH)) return -ENOEXEC; } else value += local_entry_offset(sym);
klp-convert-tree tests* ran OK with this patch (with the nit fixed) on top of Song's v10. LMK if you want me to push a branch with some or all of these patches for further testing. * I removed the tests that check for relocation clearing, only tested module reloading -- Joe