Re: [PATCH 01/17] Fix "x86/alternatives: Lockdep-enforce text_mutex in text_poke*()"
From: hpa@zytor.com
Date: 2019-01-17 23:00:04
Also in:
linux-integrity, linux-mm, lkml
On January 17, 2019 2:39:15 PM PST, Nadav Amit [off-list ref] wrote:
quoted
On Jan 17, 2019, at 1:15 PM, hpa@zytor.com wrote: On January 16, 2019 10:47:01 PM PST, Masami Hiramatsu[off-list ref] wrote:quoted
quoted
On Wed, 16 Jan 2019 16:32:43 -0800 Rick Edgecombe [off-list ref] wrote:quoted
From: Nadav Amit <redacted> text_mutex is currently expected to be held before text_poke() is called, but we kgdb does not take the mutex, and instead*supposedly*quoted
quoted
quoted
ensures the lock is not taken and will not be acquired by any othercorequoted
while text_poke() is running. The reason for the "supposedly" comment is that it is not entirelyclearquoted
that this would be the case if gdb_do_roundup is zero. This patch creates two wrapper functions, text_poke() and text_poke_kgdb() which do or do not run the lockdep assertion respectively. While we are at it, change the return code of text_poke() tosomethingquoted
meaningful. One day, callers might actually respect it and theexistingquoted
BUG_ON() when patching fails could be removed. For kgdb, the return value can actually be used.Looks good to me. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Thank you,quoted
Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <redacted> Cc: Dave Hansen <redacted> Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 9222f606506c ("x86/alternatives: Lockdep-enforce text_mutexinquoted
quoted
text_poke*()")quoted
Suggested-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Jiri Kosina <redacted> Signed-off-by: Nadav Amit <redacted> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> --- arch/x86/include/asm/text-patching.h | 1 + arch/x86/kernel/alternative.c | 52++++++++++++++++++++--------quoted
arch/x86/kernel/kgdb.c | 11 +++--- 3 files changed, 45 insertions(+), 19 deletions(-)diff --git a/arch/x86/include/asm/text-patching.hb/arch/x86/include/asm/text-patching.hquoted
index e85ff65c43c3..f8fc8e86cf01 100644--- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h@@ -35,6 +35,7 @@ extern void *text_poke_early(void *addr, constvoidquoted
quoted
*opcode, size_t len);quoted
* inconsistent instruction while you patch. */ extern void *text_poke(void *addr, const void *opcode, size_t len); +extern void *text_poke_kgdb(void *addr, const void *opcode, size_tlen);quoted
extern int poke_int3_handler(struct pt_regs *regs); extern void *text_poke_bp(void *addr, const void *opcode, size_tlen, void *handler);quoted
extern int after_bootmem;diff --git a/arch/x86/kernel/alternative.cb/arch/x86/kernel/alternative.cquoted
index ebeac487a20c..c6a3a10a2fd5 100644--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c@@ -678,18 +678,7 @@ void *__init_or_module text_poke_early(void*addr, const void *opcode,quoted
return addr; } -/** - * text_poke - Update instructions on a live kernel - * @addr: address to modify - * @opcode: source of the copy - * @len: length to copy - * - * Only atomic text poke/set should be allowed when not doingearlyquoted
quoted
patching.quoted
- * It means the size must be writable atomically and the addressmust be alignedquoted
- * in a way that permits an atomic write. It also makes sure wefitquoted
quoted
on a singlequoted
- * page. - */ -void *text_poke(void *addr, const void *opcode, size_t len) +static void *__text_poke(void *addr, const void *opcode, size_tlen)quoted
quoted
quoted
{ unsigned long flags; char *vaddr;@@ -702,8 +691,6 @@ void *text_poke(void *addr, const void *opcode,size_t len)quoted
*/ BUG_ON(!after_bootmem); - lockdep_assert_held(&text_mutex); - if (!core_kernel_text((unsigned long)addr)) { pages[0] = vmalloc_to_page(addr); pages[1] = vmalloc_to_page(addr + PAGE_SIZE);@@ -732,6 +719,43 @@ void *text_poke(void *addr, const void*opcode,quoted
quoted
size_t len)quoted
return addr; } +/** + * text_poke - Update instructions on a live kernel + * @addr: address to modify + * @opcode: source of the copy + * @len: length to copy + * + * Only atomic text poke/set should be allowed when not doingearlyquoted
quoted
patching.quoted
+ * It means the size must be writable atomically and the addressmust be alignedquoted
+ * in a way that permits an atomic write. It also makes sure wefitquoted
quoted
on a singlequoted
+ * page. + */ +void *text_poke(void *addr, const void *opcode, size_t len) +{ + lockdep_assert_held(&text_mutex); + + return __text_poke(addr, opcode, len); +} + +/** + * text_poke_kgdb - Update instructions on a live kernel by kgdb + * @addr: address to modify + * @opcode: source of the copy + * @len: length to copy + * + * Only atomic text poke/set should be allowed when not doingearlyquoted
quoted
patching.quoted
+ * It means the size must be writable atomically and the addressmust be alignedquoted
+ * in a way that permits an atomic write. It also makes sure wefitquoted
quoted
on a singlequoted
+ * page. + * + * Context: should only be used by kgdb, which ensures no othercorequoted
quoted
is running,quoted
+ * despite the fact it does not hold the text_mutex. + */ +void *text_poke_kgdb(void *addr, const void *opcode, size_t len) +{ + return __text_poke(addr, opcode, len); +} + static void do_sync_core(void *info) { sync_core();diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c index 5db08425063e..1461544cba8b 100644 --- a/arch/x86/kernel/kgdb.c +++ b/arch/x86/kernel/kgdb.c@@ -758,13 +758,13 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt*bpt)quoted
if (!err) return err; /* - * It is safe to call text_poke() because normal kernel execution + * It is safe to call text_poke_kgdb() because normal kernelexecutionquoted
* is stopped on all cores, so long as the text_mutex is notlocked.quoted
*/ if (mutex_is_locked(&text_mutex)) return -EBUSY; - text_poke((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr, - BREAK_INSTR_SIZE); + text_poke_kgdb((void *)bpt->bpt_addr,arch_kgdb_ops.gdb_bpt_instr,quoted
quoted
quoted
+ BREAK_INSTR_SIZE); err = probe_kernel_read(opc, (char *)bpt->bpt_addr,BREAK_INSTR_SIZE);quoted
if (err) return err;@@ -783,12 +783,13 @@ int kgdb_arch_remove_breakpoint(structkgdb_bkpt *bpt)quoted
if (bpt->type != BP_POKE_BREAKPOINT) goto knl_write; /* - * It is safe to call text_poke() because normal kernel execution + * It is safe to call text_poke_kgdb() because normal kernelexecutionquoted
* is stopped on all cores, so long as the text_mutex is notlocked.quoted
*/ if (mutex_is_locked(&text_mutex)) goto knl_write; - text_poke((void *)bpt->bpt_addr, bpt->saved_instr,BREAK_INSTR_SIZE);quoted
+ text_poke_kgdb((void *)bpt->bpt_addr, bpt->saved_instr, + BREAK_INSTR_SIZE); err = probe_kernel_read(opc, (char *)bpt->bpt_addr,BREAK_INSTR_SIZE);quoted
if (err || memcmp(opc, bpt->saved_instr, BREAK_INSTR_SIZE)) goto knl_write; -- 2.17.1If you are reorganizing this code, please do so so that the callerdoesn’tquoted
have to worry about if it should call text_poke_bp() ortext_poke_early().quoted
Right now the caller had to know that, which makes no sense.Did you look at "[11/17] x86/jump-label: remove support for custom poker”? https://lore.kernel.org/patchwork/patch/1032857/ If this is not what you regard, please be more concrete. text_poke_early() is still used directly on init and while modules are loaded, which might not be great, but is outside of the scope of this patch-set.
I don't think it is out of scope, although that patch is a huge step in the right direction.
text_poke_{early,bp,...}, however, should be fully internal, that is, static functions, and we should present a single interface, preferably called text_poke(), to the outside world.
I think we have three subcases:
1. Early, UP, or under stop_machine();
2. Atomic and aligned;
3. Breakpoint.
My proposed algorithm should remove the need for a fixup which should help this interface, too.
The specific alignment needed for #2 is started by the hardware people to be not crossing 16 bytes (NOT a cache line) on any CPU we support SMP on and, of course, being possible to do atomically do on the specific CPU (note that we *can* do a redundantly large store of existing bytes, which adds flexibility.)
To the best of my knowledge any CPU supporting SSE can do an atomic (for our purposes) aligned 16-byte store via MOVAPS; of course any CPU with cx16 can do it without SSE registers. For older CPUs we may be limited to 8-byte stores (cx8) or even 4-byte stores before we need to use the breakpoint algorithm.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.