Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure
From: Bill Wendling <morbo@google.com>
Date: 2021-01-12 00:39:56
Also in:
linux-kbuild, lkml
On Mon, Jan 11, 2021 at 12:31 PM Fangrui Song [off-list ref] wrote:
> On 2021-01-11, Bill Wendling wrote:
>> On Mon, Jan 11, 2021 at 12:12 PM Fangrui Song [off-list ref] wrote:
>>> On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote:
>>>> From: Sami Tolvanen <samitolvanen@google.com>
>>>>
>>>> Enable the use of clang's Profile-Guided Optimization[1]. To generate a
>>>> profile, the kernel is instrumented with PGO counters, a representative
>>>> workload is run, and the raw profile data is collected from
>>>> /sys/kernel/debug/pgo/profraw.
>>>>
>>>> The raw profile data must be processed by clang's "llvm-profdata" tool
>>>> before it can be used during recompilation:
>>>>
>>>>   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
>>>>   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>>>>
>>>> Multiple raw profiles may be merged during this step.
>>>>
>>>> The data can be used either by the compiler if LTO isn't enabled:
>>>>
>>>>   ... -fprofile-use=vmlinux.profdata ...
>>>>
>>>> or by LLD if LTO is enabled:
>>>>
>>>>   ... -lto-cs-profile-file=vmlinux.profdata ...
>>>
>>> This LLD option does not exist.
>>> LLD does have some `--lto-*` options but the `-lto-*` form is not
>>> supported (it clashes with -l) https://reviews.llvm.org/D79371
>>
>> That's strange. I've been using that option for years now. :-) Is this
>> a recent change?
>
> The more frequently used options (specified by the clang driver) are
> -plugin-opt=... (options implemented by LLVMgold.so). `-lto-*` is rare.
>
>>> (There is an earlier -fprofile-instr-generate which does
>>> instrumentation in Clang, but the option does not have broad usage.
>>> It is used more for code coverage, not for optimization. Noticeably,
>>> it does not even implement the Kirchhoff's current law optimization.)
>>
>> Right. I've been told outside of this email that -fprofile-generate is
>> the preferred flag to use.
>>
>>> -fprofile-use= is used by both regular PGO and context-sensitive PGO
>>> (CSPGO).
>>>
>>> clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to
>>> the linker. For regular PGO, this option is effectively a no-op
>>> (confirmed with CSPGO main developer).
>>>
>>> So I think the "or by LLD if LTO is enabled:" part should be removed.
>>
>> But what if you specify the linking step explicitly? Linux doesn't call
>> "clang" when linking, but "ld.lld".
>
> Regular PGO+LTO does not need -plugin-opt=cs-profile-path=
> CSPGO+LTO needs it.
> Because -fprofile-use= may be used by both, the Clang driver adds it.
> CSPGO is not relevant in this patch, so the linker option does not need
> to be mentioned.

I'm still a bit confused. Are you saying that when clang uses
`-flto=thin -fprofile-use=foo` that the profile file "foo" is embedded
into the bitcode file so that when the linker's run it'll be used?

This is the workflow:

  clang ... -fprofile-use=vmlinux.profdata ... -c -o foo.o foo.c
  clang ... -fprofile-use=vmlinux.profdata ... -c -o bar.o bar.c
  ld.lld ... <output file> foo.o bar.o

Are you saying that we don't need to have
"-plugin-opt=cs-profile-path=vmlinux.profdata" on the "ld.lld ..."
line?

-bw
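
For reference, the workflow under discussion can be written out as a small dry-run script. This is only a sketch: it prints the command lines instead of running them, and the function names (merge_cmd, compile_cmd, link_cmd) and the "regular"/"cspgo" mode argument are made up for illustration. The link step follows Fangrui's description that only CSPGO+LTO needs -plugin-opt=cs-profile-path= on an explicit ld.lld line; that claim is taken from the thread and has not been verified here.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than executing them.
# PROFDATA and all function names are illustrative, not part of the patch.
PROFDATA=vmlinux.profdata

# Merge one or more raw profiles into a single indexed profile.
merge_cmd() {
    echo "llvm-profdata merge --output=$PROFDATA $*"
}

# Compile step: -fprofile-use= is passed for regular PGO and CSPGO alike.
compile_cmd() {
    echo "clang ... -fprofile-use=$PROFDATA ... -c -o $1.o $1.c"
}

# Link step: per the thread, only CSPGO+LTO is said to need the
# cs-profile-path plugin option when ld.lld is invoked directly.
link_cmd() {
    mode=$1; shift
    if [ "$mode" = cspgo ]; then
        echo "ld.lld ... -plugin-opt=cs-profile-path=$PROFDATA <output file> $*"
    else
        echo "ld.lld ... <output file> $*"
    fi
}

merge_cmd vmlinux.profraw
compile_cmd foo
compile_cmd bar
link_cmd regular foo.o bar.o
```

Calling link_cmd with "cspgo" instead of "regular" prints the variant that carries the plugin option, which is the line the question above is about.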