Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure
From: Bill Wendling <morbo@google.com>
Date: 2021-01-11 21:33:59
Also in: linux-kbuild, lkml
On Mon, Jan 11, 2021 at 1:18 PM Nick Desaulniers [off-list ref] wrote:
> On Mon, Jan 11, 2021 at 1:04 PM Nathan Chancellor [off-list ref] wrote:
> > On Mon, Jan 11, 2021 at 12:18:21AM -0800, Bill Wendling wrote:
> > > From: Sami Tolvanen <samitolvanen@google.com>
> > >
> > > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > > profile, the kernel is instrumented with PGO counters, a representative
> > > workload is run, and the raw profile data is collected from
> > > /sys/kernel/debug/pgo/profraw.
> > >
> > > The raw profile data must be processed by clang's "llvm-profdata" tool
> > > before it can be used during recompilation:
> > >
> > >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > >
> > > Multiple raw profiles may be merged during this step.
> > >
> > > The data can be used either by the compiler if LTO isn't enabled:
> > >
> > >   ... -fprofile-use=vmlinux.profdata ...
> > >
> > > or by LLD if LTO is enabled:
> > >
> > >   ... -lto-cs-profile-file=vmlinux.profdata ...
> > >
> > > This initial submission is restricted to x86, as that's the platform we
> > > know works. This restriction can be lifted once other platforms have
> > > been verified to work with PGO.
> > >
> > > Note that this method of profiling the kernel is clang-native and isn't
> > > compatible with clang's gcov support in kernel/gcov.
> > >
> > > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> > >
> > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > > Co-developed-by: Bill Wendling <morbo@google.com>
> > > Signed-off-by: Bill Wendling <morbo@google.com>
> >
> > I took this for a spin against x86_64_defconfig and ran into two issues:
> >
> > 1. https://github.com/ClangBuiltLinux/linux/issues/1252
> >
> >    "Cannot split an edge from a CallBrInst"
>
> Looks like that should be fixed first, then we should gate this feature
> on clang-12.

Weird. I'll investigate.

> >    There is also one in drivers/gpu/drm/i915/i915_query.c. For the time
> >    being, I added PGO_PROFILE_... := n for those two files.
> >
> > 2. After doing that, I run into an undefined function error with ld.lld.
> >
> > How I tested:
> >
> > $ make -skj"$(nproc)" LLVM=1 defconfig
> > $ scripts/config -e PGO_CLANG
> > $ make -skj"$(nproc)" LLVM=1 olddefconfig vmlinux all
> > ld.lld: error: undefined symbol: __llvm_profile_instrument_memop
>
> Err...that seems like it should be implemented in
> kernel/pgo/instrument.c in this patch in a v2?

Yes. I'll submit a new V2 with this and other feedback integrated.

> > referenced by head64.c arch/x86/kernel/head64.o:(__early_make_pgtable)
> > referenced by head64.c arch/x86/kernel/head64.o:(x86_64_start_kernel)
> > referenced by head64.c arch/x86/kernel/head64.o:(copy_bootdata)
> > referenced 2259 more times
> >
> > Local diff:
> >
> > diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> > index ffce287ef415..4b2f238770b5 100644
> > --- a/drivers/char/Makefile
> > +++ b/drivers/char/Makefile
> > @@ -4,6 +4,7 @@
> >  #
> >  obj-y += mem.o random.o
> > +PGO_PROFILE_random.o := n
> >  obj-$(CONFIG_TTY_PRINTK) += ttyprintk.o
> >  obj-y += misc.o
> >  obj-$(CONFIG_ATARI_DSP56K) += dsp56k.o
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index e5574e506a5c..d83cacc79b1a 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -168,6 +168,7 @@ i915-y += \
> >  i915_vma.o \
> >  intel_region_lmem.o \
> >  intel_wopcm.o
> > +PGO_PROFILE_i915_query.o := n
> >  # general-purpose microcontroller (GuC) support
> >  i915-y += gt/uc/intel_uc.o \
>
> I'd rather have these both sorted out before landing with PGO disabled
> on these files.

Agreed. -bw
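
For reference, the collect-and-merge steps described in the quoted commit message could be wrapped in a small shell helper along these lines. This is only a sketch: the function names, the DRY_RUN switch and the PROFRAW_SRC variable are hypothetical and not part of the patch; the debugfs path and the llvm-profdata invocation are the ones given in the commit message.

```shell
#!/bin/sh
# Sketch of the collect-and-merge workflow from the commit message.
# Hypothetical names (not from the patch): pgo_merge_cmd, pgo_collect,
# DRY_RUN, PROFRAW_SRC.

# Default to the debugfs file named in the commit message.
PROFRAW_SRC="${PROFRAW_SRC:-/sys/kernel/debug/pgo/profraw}"

# Print the llvm-profdata command that merges one or more raw profiles
# into a single indexed profile: pgo_merge_cmd OUT RAW [RAW...]
pgo_merge_cmd() {
    merge_out="$1"
    shift
    printf 'llvm-profdata merge --output=%s' "$merge_out"
    for merge_in in "$@"; do
        printf ' %s' "$merge_in"
    done
    printf '\n'
}

# Copy the raw profile out of debugfs and merge it.
# With DRY_RUN=1 the commands are printed instead of executed.
pgo_collect() {
    collect_raw="${1:-vmlinux.profraw}"
    collect_out="${2:-vmlinux.profdata}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'cp %s %s\n' "$PROFRAW_SRC" "$collect_raw"
        pgo_merge_cmd "$collect_out" "$collect_raw"
        return 0
    fi
    cp "$PROFRAW_SRC" "$collect_raw" || return 1
    llvm-profdata merge --output="$collect_out" "$collect_raw"
}
```

Running `DRY_RUN=1 pgo_collect` after sourcing the file prints the two commands without touching debugfs. Since the commit message notes that multiple raw profiles may be merged in this step, `pgo_merge_cmd` accepts any number of input files.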