Thread: 42 messages, 10 authors, 2016-02-10

Re: [PATCH v6 1/9] ppc64 (le): prepare for -mprofile-kernel

From: Petr Mladek <pmladek@suse.com>
Date: 2016-02-04 11:03:47
Also in: lkml

On Thu 2016-02-04 18:31:40, AKASHI Takahiro wrote:
> Jiri, Torsten
>
> Thank you for your explanation.
>
> On 02/03/2016 08:24 PM, Torsten Duwe wrote:
> > On Wed, Feb 03, 2016 at 09:55:11AM +0100, Jiri Kosina wrote:
> > > On Wed, 3 Feb 2016, AKASHI Takahiro wrote:
> > > > those efforts, we are proposing[1] a new *generic* gcc option, -fprolog-add=N.
> > > > This option will insert N nop instructions at the beginning of each function.
> > >
> > > The interesting part of the story with ppc64 is that you indeed want to
> > > create the callsite before the *most* of the prologue, but not really :)
> >
> > I was silently assuming that GCC would do this right on ppc64le; add the NOPs
> > right after the TOC load. Or after TOC load and LR save? ...
>
> On arm/arm64, link register must be saved before any function call. So anyhow
> we will have to add something, 3 instructions at the minimum, like:
>    save lr
>    branch _mcount
>    restore lr
>    <prologue>
>    ...
>    <body>
>    ...
> So, it is similar to PPC that has to handle LR as well.
>
> > > The part of the prologue where TOC pointer is saved needs to happen before
> > > the fentry/profiling call.
> >
> > Yes, any call, to any profiler/tracer/live patcher is potentially global
> > and needs the _new_ TOC value.

The code below is generated for PPC64LE with -mprofile-kernel using:

$> gcc --version
gcc (SUSE Linux) 6.0.0 20160121 (experimental) [trunk revision 232670]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   01 00 00 48     bl      5c <cmdline_proc_show+0xc>
                        5c: R_PPC64_REL24       _mcount
  60:   a6 02 08 7c     mflr    r0
  64:   10 00 01 f8     std     r0,16(r1)
  68:   a1 ff 21 f8     stdu    r1,-96(r1)
  6c:   00 00 22 3d     addis   r9,r2,0
                        6c: R_PPC64_TOC16_HA    .toc
  70:   00 00 82 3c     addis   r4,r2,0
                        70: R_PPC64_TOC16_HA    .rodata.str1.8
  74:   00 00 29 e9     ld      r9,0(r9)
                        74: R_PPC64_TOC16_LO_DS .toc
  78:   00 00 84 38     addi    r4,r4,0
                        78: R_PPC64_TOC16_LO    .rodata.str1.8
  7c:   00 00 a9 e8     ld      r5,0(r9)
  80:   01 00 00 48     bl      80 <cmdline_proc_show+0x30>
                        80: R_PPC64_REL24       seq_printf
  84:   00 00 00 60     nop
  88:   00 00 60 38     li      r3,0
  8c:   60 00 21 38     addi    r1,r1,96
  90:   10 00 01 e8     ld      r0,16(r1)
  94:   a6 03 08 7c     mtlr    r0
  98:   20 00 80 4e     blr


And the same function compiled using:

$> gcc --version
gcc (SUSE Linux) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   10 00 01 f8     std     r0,16(r1)
  60:   01 00 00 48     bl      60 <cmdline_proc_show+0x10>
                        60: R_PPC64_REL24       _mcount
  64:   a6 02 08 7c     mflr    r0
  68:   10 00 01 f8     std     r0,16(r1)
  6c:   a1 ff 21 f8     stdu    r1,-96(r1)
  70:   00 00 42 3d     addis   r10,r2,0
                        70: R_PPC64_TOC16_HA    .toc
  74:   00 00 82 3c     addis   r4,r2,0
                        74: R_PPC64_TOC16_HA    .rodata.str1.8
  78:   00 00 2a e9     ld      r9,0(r10)
                        78: R_PPC64_TOC16_LO_DS .toc
  7c:   00 00 84 38     addi    r4,r4,0
                        7c: R_PPC64_TOC16_LO    .rodata.str1.8
  80:   00 00 a9 e8     ld      r5,0(r9)
  84:   01 00 00 48     bl      84 <cmdline_proc_show+0x34>
                        84: R_PPC64_REL24       seq_printf
  88:   00 00 00 60     nop
  8c:   00 00 60 38     li      r3,0
  90:   60 00 21 38     addi    r1,r1,96
  94:   10 00 01 e8     ld      r0,16(r1)
  98:   a6 03 08 7c     mtlr    r0
  9c:   20 00 80 4e     blr


Please note that either 3 or 4 instructions precede the mcount call
site, depending on the compiler version: GCC 4.8.5 emits an additional
"std r0,16(r1)" after the first "mflr r0".
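
For anyone who wants to check this on their own toolchain, here is a rough
sketch (not part of the patch set) that reads "objdump -dr" style output and
reports where the _mcount call sits relative to the function entry. The
helper name mcount_offsets and the sample string are made up for
illustration, and the script assumes fixed 4-byte instructions and the
listing layout shown above.

```python
import re

# Function header, e.g. "0000000000000050 <cmdline_proc_show>:"
FUNC_RE = re.compile(r"^([0-9a-f]+) <(\w+)>:")
# Relocation line, e.g. "      5c: R_PPC64_REL24       _mcount"
RELOC_RE = re.compile(r"^\s*([0-9a-f]+):\s+(R_\w+)\s+(\S+)")


def mcount_offsets(listing):
    """Map each function to (byte offset of the _mcount call, insns before it).

    Hypothetical helper: assumes objdump -dr text and 4-byte instructions.
    """
    result = {}
    func = None
    start = 0
    for line in listing.splitlines():
        m = FUNC_RE.match(line)
        if m:
            start, func = int(m.group(1), 16), m.group(2)
            continue
        m = RELOC_RE.match(line)
        # Only the first _mcount relocation in each function is recorded.
        if m and func and m.group(3) == "_mcount" and func not in result:
            offset = int(m.group(1), 16) - start
            result[func] = (offset, offset // 4)
    return result


# Start of the GCC 6 listing from this mail.
GCC6 = """\
0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   01 00 00 48     bl      5c <cmdline_proc_show+0xc>
                        5c: R_PPC64_REL24       _mcount
"""

print(mcount_offsets(GCC6))  # {'cmdline_proc_show': (12, 3)}
```

Fed the GCC 4.8.5 listing instead, the same helper should report offset 16,
i.e. 4 instructions, so any code that locates the call site cannot hardcode
a single offset from the function entry.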

Best Regards,
Petr