Thread (40 messages), 9 authors, 2016-08-10

Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination

From: Nicholas Piggin <npiggin@gmail.com>
Date: 2016-08-08 03:42:45
Also in: linux-kbuild, linuxppc-dev

On Sun, 7 Aug 2016 01:33:45 -0400 (EDT)
Nicolas Pitre [off-list ref] wrote:
On Fri, 5 Aug 2016, Nicholas Piggin wrote:
Introduce LINKER_DCE option for architectures to select if they want
to build with -ffunction-sections, -fdata-sections, and link with
--gc-sections. It requires some work (documented) to ensure all
unreferenced entrypoints are live, and requires toolchain and
build verification, so it is made a per-arch option for now.

On a random powerpc64le build, this yields a significant size saving,
it boots and runs fine, but there is a lot I haven't tested as yet,
so these savings may be reduced if there are bugs in the link.

    text      data       bss       dec   filename
11169741   1180744   1923176  14273661   vmlinux
10445269   1004127   1919707  13369103   vmlinux.dce

~700K text, ~170K data, 6% removed from kernel image size.
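
The rounded figures can be checked against the size output above with a
few lines of arithmetic. This is only a sketch: the numbers are copied
from the table, and the variable and function names are made up for
illustration.

```python
# Byte counts copied from the size output above.
before = {"text": 11169741, "data": 1180744, "bss": 1923176}  # vmlinux
after = {"text": 10445269, "data": 1004127, "bss": 1919707}   # vmlinux.dce


def saved(section):
    """Bytes removed from one section by the DCE build."""
    return before[section] - after[section]


total_before = sum(before.values())  # matches the "dec" column for vmlinux
total_after = sum(after.values())    # matches the "dec" column for vmlinux.dce
percent_saved = 100.0 * (total_before - total_after) / total_before

for name in before:
    print(f"{name}: {saved(name)} bytes saved")
print(f"total: {total_before - total_after} bytes ({percent_saved:.1f}% of dec)")
```

The text and data differences come out at roughly 724K and 177K bytes,
in line with the rounded "~700K text, ~170K data" above.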

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
I played with that too. However this needs distinct sections for 
exception tables and the like, otherwise the backward references from the 
final exception table to the functions responsible for those exception 
entries have the effect of pulling in all those functions even if their 
entry point is never referenced, making --gc-sections less effective.  
I managed to fix this only with a change to gas (accepted upstream).

But once that is solved, you then have the missing forward reference 
problem, i.e. nothing actually references those individual exception 
entry sections and ld happily drops them all. Having a KEEP() on each of 
them is unworkable and defeats the purpose anyway.  That requires a 
dummy reloc to trick ld into pulling in those sections when the parent 
section is also pulled in.
Right, although we don't *need* those things just for enabling
--gc-sections, do we? It may not be 100% optimal, but it's enough
to avoid the regression when switching to --whole-archive build
option.
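
The backward-reference and forward-reference problems discussed above
can be illustrated with a toy model of section garbage collection. This
is a sketch only, not how ld is implemented: it treats sections as graph
nodes, relocations as edges, and keeps whatever is reachable from the
roots. All section names except __ex_table are hypothetical, and the
"dummy reference" edge stands in for the dummy reloc mentioned above.

```python
from collections import deque


def gc_sections(refs, roots):
    """Return the set of sections reachable from the roots (the kept set)."""
    live = set(roots)
    queue = deque(roots)
    while queue:
        section = queue.popleft()
        for target in refs.get(section, ()):
            if target not in live:
                live.add(target)
                queue.append(target)
    return live


# Case 1: one shared exception table, kept as a root. Its backward
# references pull in every function, including the unused one.
shared = {
    ".text.start": [".text.used"],
    "__ex_table": [".text.used", ".text.unused"],
}
live_shared = gc_sections(shared, [".text.start", "__ex_table"])

# Case 2: one exception section per function (hypothetical names).
# Nothing refers to them, so they are all dropped, even the needed one.
split = {
    ".text.start": [".text.used"],
    "__ex_table.used": [".text.used"],
    "__ex_table.unused": [".text.unused"],
}
live_split = gc_sections(split, [".text.start"])

# Case 3: add a dummy reference from each function to its own exception
# section, so the entries are kept exactly when the function is kept.
linked = dict(split)
linked[".text.used"] = ["__ex_table.used"]
linked[".text.unused"] = ["__ex_table.unused"]
live_linked = gc_sections(linked, [".text.start"])

for label, live in [("shared", live_shared), ("split", live_split),
                    ("linked", live_linked)]:
    print(label, sorted(live))
```

In this model only the third layout discards the unused function while
still keeping the exception entries of the function that survives.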

Please see attached a subset of the slides I presented at ELC and Linaro 
Connect last year to illustrate those issues.

Also attached a sample patch partially implementing those changes.

In short I'm very glad to see that this might steer interest across 
multiple architectures.  I felt like this was becoming much more 
intrusive than I expected and that maybe LTO was a better bet after all. 
But LTO has its evils too and I'm willing to look at gc-sections again 
if there is interest from others as well.
Your results are impressive, and I don't want to stand in the way of
either LTO or improving accuracy of --gc-sections. But both are things
that can be built on top of this patch, I think. We don't need to do
the entire intrusive changes all at once.

Thanks,
Nick