Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination
From: Nicholas Piggin <npiggin@gmail.com>
Date: 2016-08-08 03:42:45
Also in:
linux-kbuild, linuxppc-dev
On Sun, 7 Aug 2016 01:33:45 -0400 (EDT) Nicolas Pitre [off-list ref] wrote:
> On Fri, 5 Aug 2016, Nicholas Piggin wrote:
> > Introduce LINKER_DCE option for architectures to select if they want to build with -ffunction-sections, -fdata-sections, and link with --gc-sections. It requires some work (documented) to ensure all unreferenced entrypoints are live, and requires toolchain and build verification, so it is made a per-arch option for now.
> >
> > On a random powerpc64le build, this yields a significant size saving. It boots and runs fine, but there is a lot I haven't tested as yet, so these savings may be reduced if there are bugs in the link.
> >
> >     text      data      bss       dec  filename
> > 11169741   1180744  1923176  14273661  vmlinux
> > 10445269   1004127  1919707  13369103  vmlinux.dce
> >
> > ~700K text, ~170K data, 6% removed from kernel image size.
> >
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>
> I played with that too. However this needs distinct sections for exception tables and the like, otherwise the backward references from the final exception table to the functions responsible for those exception entries have the effect of pulling in all those functions even if their entry point is never referenced, making --gc-sections less effective. I managed to fix this only with a change to gas (accepted upstream).
>
> But once that is solved, you then have the missing forward reference problem, i.e. nothing actually references those individual exception entry sections and ld happily drops them all. Having a KEEP() on each of them is unworkable and defeats the purpose anyway. That requires a dummy reloc to trick ld into pulling in those sections when the parent section is also pulled in.
Right, although we don't *need* those things just for enabling --gc-sections, do we? It may not be 100% optimal, but it's enough to avoid the regression when switching to the --whole-archive build option.
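The backward-reference effect discussed above can be illustrated with a small userspace sketch. This is not kernel code and not part of the patch: every identifier below (live_helper, dead_helper, table_only_handler, demo_entry, demo_table, run_table) is hypothetical, and the single table only loosely stands in for a monolithic exception table. The build lines in the comment assume GCC with GNU ld.

```c
/*
 * Sketch only: all names here are hypothetical, not taken from the kernel
 * or from the patch under discussion.
 *
 * Assumed build (GCC + GNU ld), with a separate main.o that calls
 * live_helper() and run_table():
 *   gcc -ffunction-sections -fdata-sections -c demo.c
 *   gcc -Wl,--gc-sections,--print-gc-sections demo.o main.o -o demo
 */
#include <stddef.h>

/* Called directly by users of this file, so its section stays live. */
int live_helper(int x)
{
	return x * 2;
}

/* Referenced by nothing: with -ffunction-sections it sits in its own
 * section, which --gc-sections is free to discard. */
int dead_helper(int x)
{
	return x + 1;
}

/* Has no direct callers; it is reachable only through the table below. */
int table_only_handler(int x)
{
	return -x;
}

struct demo_entry {
	int (*handler)(int);
};

/*
 * One shared table, standing in for a single exception table. The function
 * pointer is a reference from the table back to the handler, so whenever
 * the table is kept, the handler's section is kept as well, whether or not
 * anything else needs it. Splitting the table into per-function sections
 * removes that effect, but then nothing refers to the split-out entries,
 * which is the missing forward reference described in the quoted text.
 */
const struct demo_entry demo_table[] = {
	{ table_only_handler },
};

size_t demo_table_len(void)
{
	return sizeof(demo_table) / sizeof(demo_table[0]);
}

/* Walk the table and sum the handler results for x. */
int run_table(int x)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < demo_table_len(); i++)
		sum += demo_table[i].handler(x);
	return sum;
}
```

Under those assumptions, a link with --print-gc-sections would be expected to list the section holding dead_helper as removed, while table_only_handler survives for as long as anything references demo_table.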
> Please see attached a subset of the slides I presented at ELC and Linaro Connect last year to illustrate those issues. Also attached a sample patch partially implementing those changes.
>
> In short I'm very glad to see that this might steer interest across multiple architectures. I felt like this was becoming much more intrusive than I expected and that maybe LTO was a better bet after all. But LTO has its evils too and I'm willing to look at gc-sections again if there is interest from others as well.
Your results are impressive, and I don't want to stand in the way of either LTO or improving accuracy of --gc-sections. But both are things that can be built on top of this patch, I think. We don't need to do the entire intrusive changes all at once.

Thanks,
Nick
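The savings quoted in the patch description follow directly from its two size(1) rows. A minimal sketch of the arithmetic, with hypothetical helper names (bytes_saved, percent_saved, print_savings) and the numbers copied from the quoted table:

```c
#include <stdio.h>

/* Difference between the plain and the DCE build for one size(1) column. */
long bytes_saved(long before, long after)
{
	return before - after;
}

/* The same difference as a percentage of the plain build. */
double percent_saved(long before, long after)
{
	return 100.0 * (double)(before - after) / (double)before;
}

/* Print the per-column savings for vmlinux vs. vmlinux.dce. */
void print_savings(void)
{
	printf("text: %ld bytes\n", bytes_saved(11169741, 10445269));
	printf("data: %ld bytes\n", bytes_saved(1180744, 1004127));
	printf("bss:  %ld bytes\n", bytes_saved(1923176, 1919707));
	printf("dec:  %ld bytes (%.1f%%)\n",
	       bytes_saved(14273661, 13369103),
	       percent_saved(14273661, 13369103));
}
```

The differences come to 724472 bytes of text, 176617 bytes of data and 904558 bytes overall, which matches the rounded "~700K text, ~170K data, 6%" in the description.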