Thread (99 messages) 99 messages, 10 authors, 2021-06-14

Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

From: Bill Wendling <morbo@google.com>
Date: 2021-06-12 17:26:28
Also in: linux-doc, lkml

On Sat, Jun 12, 2021 at 9:59 AM Peter Zijlstra [off-list ref] wrote:
On Wed, Apr 07, 2021 at 02:17:04PM -0700, Bill Wendling wrote:
quoted
From: Sami Tolvanen <samitolvanen@google.com>

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.
*sigh*, and not a single x86 person on Cc, how nice :-/
This tool is generic and, despite the fact that it's first enabled for
x86, it contains no x86-specific code. The reason we're restricting it
to x86 is because it's the platform we tested on.
quoted
Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
Also, and I don't see this answered *anywhere*, why are you not using
perf for this? Your link even mentions Sampling Profilers (and I happen
to know there's been significant effort to make perf output work as
input for the PGO passes of the various compilers).
Instruction-based (non-sampling) profiling gives us a better
context-sensitive profile, making PGO more impactful. It's also useful
for coverage whereas sampling profiles cannot.
quoted
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Co-developed-by: Bill Wendling <morbo@google.com>
Signed-off-by: Bill Wendling <morbo@google.com>
Tested-by: Nick Desaulniers <redacted>
Reviewed-by: Nick Desaulniers <redacted>
Reviewed-by: Fangrui Song <redacted>
---
 Documentation/dev-tools/index.rst     |   1 +
 Documentation/dev-tools/pgo.rst       | 127 +++++++++
 MAINTAINERS                           |   9 +
 Makefile                              |   3 +
 arch/Kconfig                          |   1 +
 arch/x86/Kconfig                      |   1 +
 arch/x86/boot/Makefile                |   1 +
 arch/x86/boot/compressed/Makefile     |   1 +
 arch/x86/crypto/Makefile              |   4 +
 arch/x86/entry/vdso/Makefile          |   1 +
 arch/x86/kernel/vmlinux.lds.S         |   2 +
 arch/x86/platform/efi/Makefile        |   1 +
 arch/x86/purgatory/Makefile           |   1 +
 arch/x86/realmode/rm/Makefile         |   1 +
 arch/x86/um/vdso/Makefile             |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h     |  34 +++
 kernel/Makefile                       |   1 +
 kernel/pgo/Kconfig                    |  35 +++
 kernel/pgo/Makefile                   |   5 +
 kernel/pgo/fs.c                       | 389 ++++++++++++++++++++++++++
 kernel/pgo/instrument.c               | 189 +++++++++++++
 kernel/pgo/pgo.h                      | 203 ++++++++++++++
 scripts/Makefile.lib                  |  10 +
 24 files changed, 1022 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h
quoted
--- a/Makefile
+++ b/Makefile
@@ -660,6 +660,9 @@ endif # KBUILD_EXTMOD
 # Defaults to vmlinux, but the arch makefile usually adds further targets
 all: vmlinux

+CFLAGS_PGO_CLANG := -fprofile-generate
+export CFLAGS_PGO_CLANG
+
 CFLAGS_GCOV  := -fprofile-arcs -ftest-coverage \
      $(call cc-option,-fno-tree-loop-im) \
      $(call cc-disable-warning,maybe-uninitialized,)
And which of the many flags in noinstr disables this?
These flags aren't used with PGO. So there's no need to disable them.
Basically I would like to NAK this whole thing until someone can
adequately explain the interaction with noinstr and why we need those
many lines of kernel code and can't simply use perf for this.
-bw
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help