Thread (99 messages) 99 messages, 10 authors, 2021-06-14

Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

From: Nathan Chancellor <hidden>
Date: 2021-01-11 21:05:35
Also in: linux-kbuild, lkml
Subsystem: char and misc drivers, drm drivers, intel drm i915 driver (meteor lake, dg2 and older excluding poulsbo, moorestown and derivative), the rest · Maintainers: Arnd Bergmann, Greg Kroah-Hartman, David Airlie, Simona Vetter, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Linus Torvalds

On Mon, Jan 11, 2021 at 12:18:21AM -0800, Bill Wendling wrote:
From: Sami Tolvanen <samitolvanen@google.com>

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool before
it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can be used either by the compiler if LTO isn't enabled:

    ... -fprofile-use=vmlinux.profdata ...

or by LLD if LTO is enabled:

    ... -lto-cs-profile-file=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we know
works. This restriction can be lifted once other platforms have been verified
to work with PGO.

Note that this method of profiling the kernel is clang-native and isn't
compatible with clang's gcov support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Co-developed-by: Bill Wendling <morbo@google.com>
Signed-off-by: Bill Wendling <morbo@google.com>
I took this for a spin against x86_64_defconfig and ran into two issues:

1. https://github.com/ClangBuiltLinux/linux/issues/1252

   There is also one in drivers/gpu/drm/i915/i915_query.c. For the time
   being, I added PGO_PROFILE_... := n for those two files.

2. After doing that, I run into an undefined function error with ld.lld.

How I tested:

$ make -skj"$(nproc)" LLVM=1 defconfig

$ scripts/config -e PGO_CLANG

$ make -skj"$(nproc)" LLVM=1 olddefconfig vmlinux all
ld.lld: error: undefined symbol: __llvm_profile_instrument_memop
quoted
quoted
referenced by head64.c
              arch/x86/kernel/head64.o:(__early_make_pgtable)
referenced by head64.c
              arch/x86/kernel/head64.o:(x86_64_start_kernel)
referenced by head64.c
              arch/x86/kernel/head64.o:(copy_bootdata)
referenced 2259 more times
Local diff:
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index ffce287ef415..4b2f238770b5 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -4,6 +4,7 @@
 #
 
 obj-y				+= mem.o random.o
+PGO_PROFILE_random.o		:= n
 obj-$(CONFIG_TTY_PRINTK)	+= ttyprintk.o
 obj-y				+= misc.o
 obj-$(CONFIG_ATARI_DSP56K)	+= dsp56k.o
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e5574e506a5c..d83cacc79b1a 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -168,6 +168,7 @@ i915-y += \
 	  i915_vma.o \
 	  intel_region_lmem.o \
 	  intel_wopcm.o
+PGO_PROFILE_i915_query.o := n
 
 # general-purpose microcontroller (GuC) support
 i915-y += gt/uc/intel_uc.o \
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help