Thread (41 messages), 8 authors, 2020-12-09

Re: [PATCH v8 02/16] kbuild: add support for Clang LTO

From: Masahiro Yamada <masahiroy@kernel.org>
Date: 2020-12-02 03:01:35
Also in: linux-arch, linux-kbuild, linux-pci, lkml

On Wed, Dec 2, 2020 at 6:37 AM 'Sami Tolvanen' via Clang Built Linux
[off-list ref] wrote:
This change adds build system support for Clang's Link Time
Optimization (LTO). With -flto, instead of ELF object files, Clang
produces LLVM bitcode, which is compiled into native code at link
time, allowing the final binary to be optimized globally. For more
details, see:

  https://llvm.org/docs/LinkTimeOptimization.html

The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
which defaults to LTO being disabled. To use LTO, the architecture
must select ARCH_SUPPORTS_LTO_CLANG and support:

  - compiling with Clang,
  - compiling inline assembly with Clang's integrated assembler,
  - and linking with LLD.

While using full LTO results in the best runtime performance, the
compilation is not scalable in time or memory. CONFIG_THINLTO
enables ThinLTO, which allows parallel optimization and faster
incremental builds. ThinLTO is used by default if the architecture
also selects ARCH_SUPPORTS_THINLTO:

  https://clang.llvm.org/docs/ThinLTO.html

To enable LTO, LLVM tools must be used to handle bitcode files. The
easiest way is to pass the LLVM=1 option to make:

  $ make LLVM=1 defconfig
  $ scripts/config -e LTO_CLANG
  $ make LLVM=1

Alternatively, at least the following LLVM tools must be used:

  CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm

To prepare for LTO support with other compilers, common parts are
gated behind the CONFIG_LTO option, and LTO can be disabled for
specific files by filtering out CC_FLAGS_LTO.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <redacted>
---
 Makefile                          | 19 ++++++-
 arch/Kconfig                      | 88 +++++++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h | 11 ++--
 scripts/Makefile.build            |  9 +++-
 scripts/Makefile.modfinal         |  9 +++-
 scripts/Makefile.modpost          | 21 +++++++-
 scripts/link-vmlinux.sh           | 32 ++++++++---
 7 files changed, 171 insertions(+), 18 deletions(-)
diff --git a/Makefile b/Makefile
index 16b7f0890e75..f5cac2428efc 100644
--- a/Makefile
+++ b/Makefile
@@ -891,6 +891,21 @@ KBUILD_CFLAGS      += $(CC_FLAGS_SCS)
 export CC_FLAGS_SCS
 endif

+ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_LTO_CLANG_THIN
+CC_FLAGS_LTO   += -flto=thin -fsplit-lto-unit
+KBUILD_LDFLAGS += --thinlto-cache-dir=$(extmod-prefix).thinlto-cache
+else
+CC_FLAGS_LTO   += -flto
+endif
+CC_FLAGS_LTO   += -fvisibility=default
+endif
+
+ifdef CONFIG_LTO
+KBUILD_CFLAGS  += $(CC_FLAGS_LTO)
+export CC_FLAGS_LTO
+endif
+
 ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
 KBUILD_CFLAGS += -falign-functions=32
 endif
@@ -1471,7 +1486,7 @@ MRPROPER_FILES += include/config include/generated          \
                  *.spec

 # Directories & files removed with 'make distclean'
-DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS
+DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache

 # clean - Delete most, but leave enough to build external modules
 #
@@ -1717,7 +1732,7 @@ PHONY += compile_commands.json

 clean-dirs := $(KBUILD_EXTMOD)
 clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps \
-       $(KBUILD_EXTMOD)/compile_commands.json
+       $(KBUILD_EXTMOD)/compile_commands.json $(KBUILD_EXTMOD)/.thinlto-cache

 PHONY += help
 help:
diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..30907b554451 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -598,6 +598,94 @@ config SHADOW_CALL_STACK
          reading and writing arbitrary memory may be able to locate them
          and hijack control flow by modifying the stacks.

+config LTO
+       bool
+       help
+         Selected if the kernel will be built using the compiler's LTO feature.
+
+config LTO_CLANG
+       bool
+       select LTO
+       help
+         Selected if the kernel will be built using Clang's LTO feature.
+
+config ARCH_SUPPORTS_LTO_CLANG
+       bool
+       help
+         An architecture should select this option if it supports:
+         - compiling with Clang,
+         - compiling inline assembly with Clang's integrated assembler,
+         - and linking with LLD.
+
+config ARCH_SUPPORTS_LTO_CLANG_THIN
+       bool
+       help
+         An architecture should select this option if it can support Clang's
+         ThinLTO mode.
+
+config HAS_LTO_CLANG
+       def_bool y
+       # Clang >= 11: https://github.com/ClangBuiltLinux/linux/issues/510
+       depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
+       depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
+       depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
+       depends on ARCH_SUPPORTS_LTO_CLANG
+       depends on !FTRACE_MCOUNT_USE_RECORDMCOUNT
+       depends on !KASAN
+       depends on !GCOV_KERNEL
+       depends on !MODVERSIONS
+       help
+         The compiler and Kconfig options support building with Clang's
+         LTO.
+
+choice
+       prompt "Link Time Optimization (LTO)"
+       default LTO_NONE
+       help
+         This option enables Link Time Optimization (LTO), which allows the
+         compiler to optimize binaries globally.
+
+         If unsure, select LTO_NONE. Note that LTO is very resource-intensive
+         so it's disabled by default.
+
+config LTO_NONE
+       bool "None"
+       help
+         Build the kernel normally, without Link Time Optimization (LTO).
+
+config LTO_CLANG_FULL
+       bool "Clang Full LTO (EXPERIMENTAL)"
+       depends on HAS_LTO_CLANG
+       select LTO_CLANG
+       help
+          This option enables Clang's full Link Time Optimization (LTO), which
+          allows the compiler to optimize the kernel globally. If you enable
+          this option, the compiler generates LLVM bitcode instead of ELF
+          object files, and the actual compilation from bitcode happens at
+          the LTO link step, which may take several minutes depending on the
+          kernel configuration. More information can be found from LLVM's
+          documentation:
+
+           https://llvm.org/docs/LinkTimeOptimization.html
+
This help document is misleading.
People who read it would overestimate how great this feature is.

This should be added in the commit log and Kconfig help:

            In contrast to the example in the documentation, Clang LTO
            for the kernel cannot remove any unreachable function or data.
            In fact, this results in even bigger vmlinux and modules.
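
One way to check the size claim on a given configuration would be to build
the same tree twice, once with LTO_NONE and once with LTO_CLANG_FULL, and
compare the resulting vmlinux files. The following is only a sketch: the
output directories out-nolto and out-lto and the size_delta helper are
names made up for illustration, and the build steps are listed as comments
rather than executed.

```shell
#!/bin/sh
# Sketch only: "out-nolto", "out-lto" and size_delta are hypothetical names.
#
# The two trees would be built along these lines (not run here):
#   make LLVM=1 O=out-nolto defconfig
#   make LLVM=1 O=out-nolto
#   make LLVM=1 O=out-lto defconfig
#   scripts/config --file out-lto/.config -d LTO_NONE -e LTO_CLANG_FULL
#   make LLVM=1 O=out-lto olddefconfig
#   make LLVM=1 O=out-lto

# Print how many bytes larger the second file is than the first
# (negative if it is smaller). Fails if either file is missing.
size_delta() {
    [ -f "$1" ] && [ -f "$2" ] || return 1
    base=$(wc -c < "$1")
    lto=$(wc -c < "$2")
    echo $((lto - base))
}
```

After both builds finish, `size_delta out-nolto/vmlinux out-lto/vmlinux`
would print the growth in bytes. Raw file size is a crude measure, since it
includes any debug info; running `size` or `llvm-size` on the two files
gives a per-section breakdown instead.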




-- 
Best Regards
Masahiro Yamada

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel