Thread: 99 messages, 10 authors, 2021-06-14

Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

From: Peter Zijlstra <peterz@infradead.org>
Date: 2021-06-12 20:31:52
Also in: linux-kbuild, lkml

On Sat, Jun 12, 2021 at 01:20:15PM -0700, Fangrui Song wrote:
> For applications, I think instrumentation based PGO can be 1%~4% faster
> than sample-based PGO (e.g. AutoFDO) on x86.

Why? What specifically is missed by sample-based? I thought that LBR
augmented samples were very useful for exactly this.

> Sample-based PGO has CPU requirement (e.g. Performance Monitoring Unit).
> (my gut feeling is that there may be larger gap between instrumentation
> based PGO and sample-based PGO for aarch64/ppc64, even though they can
> use sample-based PGO.)
> Instrumentation based PGO can be ported to more architectures.

Every architecture that cares about performance had better have a
hardware PMU. Both argh64 and ppc64 have one.

> In addition, having an infrastructure for instrumentation based PGO
> makes it easy to deploy newer techniques like context-sensitive PGO
> (just changed compile options; it doesn't need new source level
> annotation).

What's this context sensitive stuff you speak of? The link provided
earlier is devoid of useful information.