Thread: 99 messages, 10 authors, 2021-06-14

Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

From: Bill Wendling <morbo@google.com>
Date: 2021-01-11 21:33:59
Also in: linux-kbuild, lkml

On Mon, Jan 11, 2021 at 1:18 PM Nick Desaulniers
[off-list ref] wrote:
On Mon, Jan 11, 2021 at 1:04 PM Nathan Chancellor
[off-list ref] wrote:
On Mon, Jan 11, 2021 at 12:18:21AM -0800, Bill Wendling wrote:
From: Sami Tolvanen <samitolvanen@google.com>

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool before
it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.
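
A minimal sketch of merging several raw profiles in one step. The helper name
merge_profiles, the LLVM_PROFDATA override variable, and the example file names
are assumptions made for illustration; they are not part of the patch. Only the
"llvm-profdata merge --output=..." invocation comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): merge one or more raw
# profiles into a single indexed profile.
# LLVM_PROFDATA is an assumed override for picking a specific binary.
merge_profiles() {
    # Require an output name plus at least one raw profile.
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_profiles OUTPUT RAW_PROFILE..." >&2
        return 1
    fi
    output=$1
    shift
    # All remaining arguments are raw profiles to merge.
    "${LLVM_PROFDATA:-llvm-profdata}" merge --output="$output" "$@"
}

# Example (the run names are made up):
#   merge_profiles vmlinux.profdata boot.profraw workload.profraw
```

Copying /sys/kernel/debug/pgo/profraw to a differently named file after each
workload and merging the copies afterwards keeps the runs separate until the
final step.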

The data can be used either by the compiler if LTO isn't enabled:

    ... -fprofile-use=vmlinux.profdata ...

or by LLD if LTO is enabled:

    ... -lto-cs-profile-file=vmlinux.profdata ...
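
The choice between the two flags above can be sketched as a small shell
function. The function name pgo_use_flag and its y/n second argument are
hypothetical; the two flag spellings are taken verbatim from the commit
message, and how the resulting flag reaches the compiler or linker command line
is left open here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the flag that consumes
# the merged profile, depending on whether LTO is enabled.
#   $1: path to the merged profile (for example vmlinux.profdata)
#   $2: "y" if LTO is enabled, anything else otherwise (defaults to "n")
pgo_use_flag() {
    profdata=$1
    lto_enabled=${2:-n}
    case $lto_enabled in
        y) printf '%s\n' "-lto-cs-profile-file=$profdata" ;;  # consumed by LLD
        *) printf '%s\n' "-fprofile-use=$profdata" ;;          # consumed by clang
    esac
}

# Example:
#   pgo_use_flag vmlinux.profdata y
```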

This initial submission is restricted to x86, as that's the platform we know
works. This restriction can be lifted once other platforms have been verified
to work with PGO.

Note that this method of profiling the kernel is clang-native and isn't
compatible with clang's gcov support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Co-developed-by: Bill Wendling <morbo@google.com>
Signed-off-by: Bill Wendling <morbo@google.com>

I took this for a spin against x86_64_defconfig and ran into two issues:

1. https://github.com/ClangBuiltLinux/linux/issues/1252
"Cannot split an edge from a CallBrInst"
Looks like that should be fixed first, then we should gate this
feature on clang-12.

Weird. I'll investigate.
   There is also one in drivers/gpu/drm/i915/i915_query.c. For the time
   being, I added PGO_PROFILE_... := n for those two files.

2. After doing that, I run into an undefined function error with ld.lld.

How I tested:

$ make -skj"$(nproc)" LLVM=1 defconfig

$ scripts/config -e PGO_CLANG

$ make -skj"$(nproc)" LLVM=1 olddefconfig vmlinux all
ld.lld: error: undefined symbol: __llvm_profile_instrument_memop
Err... that seems like it should be implemented in
kernel/pgo/instrument.c in a v2 of this patch?

Yes. I'll submit a v2 with this and other feedback integrated.
referenced by head64.c
              arch/x86/kernel/head64.o:(__early_make_pgtable)
referenced by head64.c
              arch/x86/kernel/head64.o:(x86_64_start_kernel)
referenced by head64.c
              arch/x86/kernel/head64.o:(copy_bootdata)
referenced 2259 more times

Local diff:
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index ffce287ef415..4b2f238770b5 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -4,6 +4,7 @@
 #

 obj-y                          += mem.o random.o
+PGO_PROFILE_random.o           := n
 obj-$(CONFIG_TTY_PRINTK)       += ttyprintk.o
 obj-y                          += misc.o
 obj-$(CONFIG_ATARI_DSP56K)     += dsp56k.o
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e5574e506a5c..d83cacc79b1a 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -168,6 +168,7 @@ i915-y += \
          i915_vma.o \
          intel_region_lmem.o \
          intel_wopcm.o
+PGO_PROFILE_i915_query.o := n

 # general-purpose microcontroller (GuC) support
 i915-y += gt/uc/intel_uc.o \
I'd rather have both of these sorted out before landing than land with
PGO disabled on these files.

Agreed.

-bw