Thread: 11 messages, 3 authors, 2022-09-13

Re: [RFC PATCH 0/4] Out-of-line static calls for powerpc64 ELF V2

From: Christophe Leroy <hidden>
Date: 2022-09-01 08:08:13

CCing static call maintainers/reviewers.

Also note that my email address changed to
christophe.leroy@csgroup.eu months ago.

On 01/09/2022 at 07:58, Benjamin Gray wrote:

WIP implementation of out-of-line static calls for PowerPC 64-bit ELF V2 ABI.
Static calls patch an indirect branch into a direct branch at runtime.
Out-of-line specifically has a caller directly call a trampoline, and
the trampoline gets patched to directly call the target. This current
implementation has a known issue described in detail below, and is
presented here for any comments or suggestions.
To reach a wider audience, I recommend you copy the people listed under
STATIC BRANCH/CALL in the MAINTAINERS file.
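
For readers new to static calls, the out-of-line scheme described above can
be modelled in user space. This is only a sketch under stated assumptions:
every `demo_*` name is made up for illustration, and the function pointer
stands in for the branch instruction that a real implementation rewrites in
the trampoline text (the real trampoline ends up with a direct branch, not
an indirect call).

```c
#include <assert.h>

typedef int (*demo_fn)(int);

int demo_add_one(int x)   { return x + 1; }
int demo_times_two(int x) { return x * 2; }

/* Assumption: models the patchable branch inside the trampoline. */
static demo_fn demo_tramp_branch = demo_add_one;

/* The trampoline lives at a fixed address that every caller calls. */
int demo_trampoline(int x)
{
    return demo_tramp_branch(x);
}

/* Models patching the trampoline so it branches to a new target. */
void demo_static_call_update(demo_fn new_target)
{
    demo_tramp_branch = new_target;
}

/* The call site itself is never modified in the out-of-line scheme. */
int demo_caller(int x)
{
    return demo_trampoline(x);
}

/* The same call site reaches a different target after the update. */
void demo_selfcheck(void)
{
    demo_static_call_update(demo_add_one);
    assert(demo_caller(5) == 6);
    demo_static_call_update(demo_times_two);
    assert(demo_caller(5) == 10);
}
```

The point of the model is that only the trampoline changes; callers keep
calling the same address.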

64-bit ELF V2 specifies a table of contents (TOC) pointer stored in r2.
Functions that use a TOC can use it to perform various operations
relative to its value. When the caller and target use different TOCs,
the static call implementation must ensure the TOC is kept consistent
so that neither tries to use the other's TOC.

However, while the trampoline can change the caller's TOC to the target's
TOC, it cannot restore the caller's TOC when the target returns. For the
trampoline to do this would require the target to return to the trampoline,
and so the return address back to the caller would need to be saved to
the stack. But the trampoline cannot move the stack because the target
may be expecting parameters relative to the stack pointer (when registers
are insufficient or varargs are used). And as static calls are usable in
generic code, there can be no arch-specific restrictions on parameters
that would sidestep this issue.

Normally the TOC change issue is resolved by the caller, which will save
and restore its TOC if necessary. For static calls though the caller
sees the trampoline as a local function, so assumes it does not change
the TOC and treats r2 as nonvolatile (no save & restore added).
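
The failure mode described above can be shown with a small user-space
model. It is a sketch, not kernel code: `demo_r2` is a plain variable
standing in for register r2, the TOC values are arbitrary, and all
`demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CALLER_TOC 0x1000u   /* arbitrary value for the caller's TOC */
#define DEMO_TARGET_TOC 0x2000u   /* arbitrary value for the target's TOC */

uintptr_t demo_r2;             /* stands in for register r2 */
uintptr_t demo_toc_in_target;  /* r2 as observed inside the target */

void demo_target(void)
{
    demo_toc_in_target = demo_r2;
}

/*
 * The trampoline switches r2 to the target's TOC and branches to the
 * target. The target returns straight to the caller, so the trampoline
 * never gets a chance to switch r2 back.
 */
void demo_trampoline(void)
{
    demo_r2 = DEMO_TARGET_TOC;
    demo_target();
}

/*
 * The caller believes the trampoline is a local function, so it does
 * not save or restore r2 around the call. Returns r2 as seen afterwards.
 */
uintptr_t demo_caller(void)
{
    demo_r2 = DEMO_CALLER_TOC;
    demo_trampoline();
    return demo_r2;
}

void demo_selfcheck(void)
{
    uintptr_t after = demo_caller();

    assert(demo_toc_in_target == DEMO_TARGET_TOC); /* target is fine */
    assert(after != DEMO_CALLER_TOC);              /* caller is not  */
}
```

The target runs with the correct TOC, but the caller continues with the
target's TOC, which is the inconsistency the cover letter is about.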

This is a similar problem to that faced by livepatch. Static calls may have
a few more options though, because the call is performed through a
`static_call` macro, allowing annotation and insertion of inline assembly
at every callsite.

I can think of several possible solutions, but they are relatively complex:

1. Patching the callsites at runtime, as is done for inline static calls.
     This also requires some inline assembly to save `r2` to the TOC pointer
     Doubleword slot on the stack before each static call, as the caller may
     not have done so in its prologue. It should be easy to add though, because
     static calls are invoked through the `static_call` macro that can be
     modified appropriately. The runtime patching then modifies the trailing
     function call `nop` to restore this r2 value.
I'm working on implementing inline static calls for ppc32. I will copy you
on the next spin (if I don't forget).
     The patching itself can probably be done at compile time at kernel callsites.

2. Use the livepatch stack method of growing the base of the stack backwards.
     I haven't looked too closely at the implementation though, especially
     regarding how much room is available.

     The benefit of this method is that there can be zero overhead when the
     trampoline and target share a TOC. So the trampoline in kernel-only
     calls can still just be a single direct branch.

3. Remove the local entry point from the trampoline. This makes the trampoline
     less efficient, as it cannot assume r2 to be correct, but should at least
     cause the caller to automatically save and restore r2 without manual patching.
     From the ABI manual:

     > 2.2.1. Function Call Linkage Protocols
     >   A function that uses a TOC pointer always has a separate local entry point
     >   [...], and preserves r2 when called via its local entry point.
     >
     > 2.2.2.1. Register Roles
     >   (a) Register r2 is nonvolatile with respect to calls between functions
     >       in the same compilation unit, except under the conditions in footnote (b)
     >   (b) Register r2 is volatile and available for use in a function that does not
     >       use a TOC pointer and that does not guarantee that it preserves r2.

     So not having a local entry point implies not using a TOC pointer, which
     implies r2 is volatile if the trampoline does not guarantee that it preserves
     r2. However experimenting with such a trampoline showed the caller still did
     not preserve its TOC when necessary, even when the trampoline used instructions
     that wrote to r2. Possibly there's an attribute that can be used to mark the
     necessary info, but I could not find one.
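
Option 1 above can be sketched as a user-space model. The assumptions are
loud ones: `demo_*` names are hypothetical, a local variable stands in for
the TOC save slot in the caller's frame (which ELF V2 places at `24(r1)`,
as far as I know), and an enum stands in for the instruction word that
follows the call.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: models the instruction word after the call at a call site. */
enum demo_slot { DEMO_SLOT_NOP, DEMO_SLOT_RESTORE_R2 };

struct demo_callsite {
    enum demo_slot trailing;
};

uintptr_t demo_r2;  /* stands in for register r2 */

/* A target with a different TOC leaves r2 changed when it returns. */
void demo_foreign_target(void)
{
    demo_r2 = 0x2000u;
}

/* Runs one static call and returns r2 as the caller sees it afterwards. */
uintptr_t demo_run_callsite(const struct demo_callsite *cs,
                            uintptr_t caller_toc)
{
    uintptr_t toc_save_slot;

    demo_r2 = caller_toc;
    toc_save_slot = demo_r2;      /* inline asm before the call: save r2 */
    demo_foreign_target();        /* call through the trampoline */
    if (cs->trailing == DEMO_SLOT_RESTORE_R2)
        demo_r2 = toc_save_slot;  /* patched-in restore of r2 */
    return demo_r2;
}

void demo_selfcheck(void)
{
    struct demo_callsite cs = { DEMO_SLOT_NOP };

    assert(demo_run_callsite(&cs, 0x1000u) == 0x2000u); /* unpatched */
    cs.trailing = DEMO_SLOT_RESTORE_R2;                 /* "patch" it */
    assert(demo_run_callsite(&cs, 0x1000u) == 0x1000u); /* restored  */
}
```

The unconditional save is what the `static_call` macro would add; only the
trailing slot needs patching.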
Another possible solution (at least for the kernel) is to restore r2 from
the PACA instead of from the stack, so it does not matter whether the
caller saved it. Module code does something similar; see the comment
before create_ftrace_stub().
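
A rough user-space model of the PACA idea follows. Everything here is an
assumption made for illustration: `struct demo_paca` and its `kernel_toc`
field only mimic the per-CPU structure that r13 points to in the kernel,
and the `demo_*` names are hypothetical. It also only models a caller that
runs with the kernel TOC, which is how I read "at least for the kernel".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mimics the per-CPU area holding the kernel's TOC value. */
struct demo_paca {
    uintptr_t kernel_toc;
};

static struct demo_paca demo_paca_storage = { .kernel_toc = 0x1000u };

struct demo_paca *demo_r13 = &demo_paca_storage; /* stands in for r13 */
uintptr_t demo_r2;                               /* stands in for r2  */

/* A module target leaves r2 pointing at the module's TOC. */
void demo_module_target(void)
{
    demo_r2 = 0x2000u;
}

/*
 * Kernel caller: nothing is saved before the call. After the call, r2
 * is reloaded from the PACA rather than from a stack slot.
 */
uintptr_t demo_kernel_caller(void)
{
    demo_r2 = demo_r13->kernel_toc;  /* caller runs with the kernel TOC */
    demo_module_target();
    demo_r2 = demo_r13->kernel_toc;  /* restore r2 from the PACA */
    return demo_r2;
}

void demo_selfcheck(void)
{
    assert(demo_kernel_caller() == demo_paca_storage.kernel_toc);
}
```

Compared with the stack-based restore, no save is needed before the call,
at the cost of only working when the caller's TOC is the kernel's.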


Benjamin Gray (3):
   static_call: Move static call selftest to static_call.c
   powerpc/64: Add support for out-of-line static calls
   powerpc/64: Add tests for out-of-line static calls

Russell Currey (1):
   powerpc/code-patching: add patch_memory() for writing RO text

  arch/powerpc/Kconfig                     |  23 +-
  arch/powerpc/include/asm/code-patching.h |   2 +
  arch/powerpc/include/asm/static_call.h   |  45 +++-
  arch/powerpc/kernel/Makefile             |   4 +-
  arch/powerpc/kernel/static_call.c        | 184 +++++++++++++++-
  arch/powerpc/kernel/static_call_test.c   | 257 +++++++++++++++++++++++
  arch/powerpc/lib/code-patching.c         |  65 ++++++
  kernel/static_call.c                     |  43 ++++
  kernel/static_call_inline.c              |  43 ----
  9 files changed, 613 insertions(+), 53 deletions(-)
  create mode 100644 arch/powerpc/kernel/static_call_test.c


base-commit: c5e4d5e99162ba8025d58a3af7ad103f155d2df7
--
2.37.2