Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
From: Fangrui Song <hidden>
Date: 2021-06-12 20:21:42
Also in:
linux-doc, lkml
On 2021-06-12, Peter Zijlstra wrote:
> On Sat, Jun 12, 2021 at 10:25:57AM -0700, Bill Wendling wrote:
>> On Sat, Jun 12, 2021 at 9:59 AM Peter Zijlstra [off-list ref] wrote:
>>> Also, and I don't see this answered *anywhere*, why are you not using perf for this? Your link even mentions Sampling Profilers (and I happen to know there's been significant effort to make perf output work as input for the PGO passes of the various compilers).
>>
>> Instruction-based (non-sampling) profiling gives us a better context-sensitive profile, making PGO more impactful. It's also useful for coverage whereas sampling profiles cannot.
>
> We've got KCOV and GCOV support already. Coverage is also not an argument mentioned anywhere else. Coverage can go pound sand, we really don't need a third means of getting that.
>
> Do you have actual numbers that back up the sampling vs instrumented argument? Having the instrumentation will affect performance which can skew the profile just the same. Also, sampling tends to capture the hot spots very well.
[I don't do kernel development. My experience is with user-space toolchains.]

For applications, I think instrumentation-based PGO can yield code that is 1%~4% faster than sample-based PGO (e.g. AutoFDO) on x86. Sample-based PGO has CPU requirements (e.g. a Performance Monitoring Unit). (My gut feeling is that there may be a larger gap between instrumentation-based PGO and sample-based PGO on aarch64/ppc64, even though they can use sample-based PGO.) Instrumentation-based PGO can be ported to more architectures.

In addition, having an infrastructure for instrumentation-based PGO makes it easy to deploy newer techniques like context-sensitive PGO (just change the compile options; it doesn't need new source-level annotations).
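
To make the "just change compile options" point concrete, here is a rough user-space sketch (not taken from the patch). The flags in the comment are the ones documented in the Clang user's manual; the file names and the hand-written counters are hypothetical and only stand in for what the compiler inserts on its own. The counters show why an instrumented profile is exact: every execution of a branch is counted, rather than estimated from samples.

```c
/*
 * Build steps for instrumentation-based PGO in user space
 * (demo.c / demo.profdata are placeholder names):
 *
 *   clang -O2 -fprofile-generate demo.c -o demo    instrumented build
 *   ./demo                                         writes a .profraw file
 *   llvm-profdata merge -o demo.profdata *.profraw
 *   clang -O2 -fprofile-use=demo.profdata demo.c -o demo
 *
 * Context-sensitive PGO is a second instrumented round that differs
 * only in its flags, with no source changes:
 *
 *   clang -O2 -fprofile-use=demo.profdata -fcs-profile-generate demo.c -o demo
 */
#include <stdint.h>

/* Hypothetical stand-ins for the counters the compiler would emit. */
uint64_t rare_path_hits;
uint64_t common_path_hits;

/* One branch with a strongly skewed outcome (taken 1 time in 16). */
static int is_rare(uint32_t x)
{
	if (x % 16 == 0) {
		rare_path_hits++;	/* exact count, not a sampled estimate */
		return 1;
	}
	common_path_hits++;
	return 0;
}

/* Reset the counters, drive the branch n times, return the rare-path total. */
uint64_t run_workload(uint32_t n)
{
	uint64_t rare = 0;

	rare_path_hits = 0;
	common_path_hits = 0;
	for (uint32_t i = 0; i < n; i++)
		rare += (uint64_t)is_rare(i);
	return rare;
}
```

Calling run_workload() from a small main() of your own and building with the steps in the comment should be enough to try the round trip; the manual counters can be dropped at that point since the compiler's instrumentation records the same information.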