Thread: 63 messages, 5 authors, 2020-08-21

Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections

From: Kees Cook <hidden>
Date: 2020-08-21 19:18:22
Also in: linux-arm-kernel, linux-efi, lkml, stable

On Tue, Aug 04, 2020 at 12:06:49PM -0400, Arvind Sankar wrote:
On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
Why is that? Both .text and .text.hot have alignment of 2^4 (default
function alignment on x86) by default, so it doesn't seem like it should
matter for packing density.  Avoiding interspersing cold text among
You may lose part of a cache line on each unit boundary. Linux has 
a lot of units, some of them small. All these bytes add up.
Separating out .text.unlikely, which isn't aligned, slightly _reduces_
this loss, but not by much -- just over 1K on a defconfig. More
importantly, it moves cold code out of line (~320k on a defconfig),
giving better code density for the hot code.

For .text and .text.hot, you lose the alignment padding on every
function boundary, not unit boundary, because of the 16-byte alignment.
Whether .text.hot and .text are arranged by translation unit or not
makes no difference.

With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get
HTHT, but in both cases the individual chunks are already aligned to 16
bytes. If .text.hot _had_ different alignment requirements to .text, the
HHTT should actually give better packing in general, I think.
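
The equal-alignment argument above can be checked with a minimal Python sketch. It is not from the thread: the `padding` helper and the chunk sizes are made up for illustration, and it assumes every chunk starts on a 16-byte boundary and that the tail is padded so the next section starts aligned as well.

```python
def padding(chunks, align=16):
    """Total alignment padding when every chunk starts on an `align` boundary."""
    offset = 0
    pad = 0
    for size in chunks:
        gap = -offset % align      # bytes needed to reach the next boundary
        pad += gap
        offset += gap + size
    pad += -offset % align         # pad the tail so the next section is aligned too
    return pad

# Hypothetical chunk sizes in bytes (illustrative only).
hot = [37, 50]
text = [21, 64]

hhtt = padding(hot + text)                           # *(.text.hot) *(.text)
htht = padding([hot[0], text[0], hot[1], text[1]])   # *(.text.hot .text)
print(hhtt, htht)
```

Under these assumptions each chunk contributes the same padding regardless of where it is placed, so both orderings lose the same number of bytes. The sketch says nothing about the case where the two sections have different alignment requirements.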
Okay, so at the end of the conversation, I think it looks like this
patch is correct: it collects the hot, unlikely, etc. sections into
their own areas (e.g. HHTTUU is more correct than HTUHTU), so this
patch stands as-is.

-- 
Kees Cook