Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections
From: Kees Cook <hidden>
Date: 2020-08-21 19:18:22
Also in:
linux-arm-kernel, linux-efi, lkml, stable
On Tue, Aug 04, 2020 at 12:06:49PM -0400, Arvind Sankar wrote:
> On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
> > > Why is that? Both .text and .text.hot have alignment of 2^4 (default
> > > function alignment on x86) by default, so it doesn't seem like it should
> > > matter for packing density. Avoiding interspersing cold text among
> >
> > You may lose part of a cache line on each unit boundary. Linux has
> > a lot of units, some of them small. All these bytes add up.
>
> Separating out .text.unlikely, which isn't aligned, slightly _reduces_
> this loss, but not by much -- just over 1K on a defconfig. More
> importantly, it moves cold code out of line (~320k on a defconfig),
> giving better code density for the hot code.
>
> For .text and .text.hot, you lose the alignment padding on every
> function boundary, not unit boundary, because of the 16-byte alignment.
> Whether .text.hot and .text are arranged by translation unit or not
> makes no difference.
>
> With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get
> HTHT, but in both cases the individual chunks are already aligned to 16
> bytes. If .text.hot _had_ different alignment requirements to .text, the
> HHTT should actually give better packing in general, I think.
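
To make the padding argument above concrete, here is a rough sketch in Python
(not kernel code, and not how the linker is implemented). It places chunks
one after another, rounding each start up to its alignment, and compares the
grouped (HHTT) and interleaved (HTHT) orders. The chunk sizes in UNITS are
made up for illustration, and each unit's .text.hot and .text is treated as a
single chunk, which is a simplification since in practice every function is
aligned.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(chunks):
    """Place (size, alignment) chunks in order; return (end_offset, padding)."""
    offset = 0
    padding = 0
    for size, alignment in chunks:
        start = align_up(offset, alignment)
        padding += start - offset
        offset = start + size
    return offset, padding


# Hypothetical per-unit chunk sizes in bytes: (hot_size, text_size).
UNITS = [(40, 100), (24, 70)]


def grouped(units, hot_align, text_align):
    """HHTT order, as with *(.text.hot) followed by *(.text)."""
    return ([(hot, hot_align) for hot, _ in units]
            + [(text, text_align) for _, text in units])


def interleaved(units, hot_align, text_align):
    """HTHT order, as with *(.text.hot .text)."""
    out = []
    for hot, text in units:
        out += [(hot, hot_align), (text, text_align)]
    return out


# Compare both orders with equal (16) and differing (64 vs 16) alignment.
for hot_align in (16, 64):
    hhtt = layout(grouped(UNITS, hot_align, 16))
    htht = layout(interleaved(UNITS, hot_align, 16))
    print(f"hot align {hot_align}: HHTT {hhtt}, HTHT {htht}")
```

With these made-up sizes, both orders end at offset 262 with 28 bytes of
padding when everything is 16-byte aligned. If the hot chunks are instead
given a hypothetical 64-byte alignment, HHTT needs 44 bytes of padding and
HTHT needs 60, which matches the expectation that grouping packs better once
the alignments differ. It is one example, not a proof.
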
Okay, so at the end of the conversation, I think it looks like this
patch is correct: it collects the hot, unlikely, etc. into their own
areas (e.g. HHTTUU is more correct than HTUHTU), so this patch stands
as-is.

-- 
Kees Cook