Re: [PATCH 00/22] add support for Clang LTO
From: Peter Zijlstra <peterz@infradead.org>
Date: 2020-06-25 08:24:59
Also in:
linux-arch, linux-kbuild, linux-pci, lkml
On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 02:31:36PM -0700, Nick Desaulniers wrote:
> > On Wed, Jun 24, 2020 at 2:15 PM Peter Zijlstra [off-list ref] wrote:
> > > On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> > > > This patch series adds support for building x86_64 and arm64 kernels with Clang's Link Time Optimization (LTO).
> > > >
> > > > In addition to performance, the primary motivation for LTO is to allow Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's Pixel devices have shipped with LTO+CFI kernels since 2018.
> > > >
> > > > Most of the patches are build system changes for handling LLVM bitcode, which Clang produces with LTO instead of ELF object files, postponing ELF processing until a later stage, and ensuring initcall ordering.
> > > >
> > > > Note that first objtool patch in the series is already in linux-next, but as it's needed with LTO, I'm including it also here to make testing easier.
> > >
> > > I'm very sad that yet again, memory ordering isn't addressed. LTO vastly increases the range of the optimizer to wreck things.
> >
> > Hi Peter, could you expand on the issue for the folks on the thread? I'm happy to try to hack something up in LLVM if we check that X does or does not happen; maybe we can even come up with some concrete test cases that can be added to LLVM's codebase?
>
> I'm sure Will will respond, but the basic issue is the trainwreck C11 made of dependent loads.
>
> Anyway, here's a link to the last time this came up:
>
>   https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/
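
For anyone who wants the dependent-load problem in concrete form, here is a rough user-space sketch. It is an illustration of the general concern, not code from the kernel or from the linked thread. All names (struct item, published, read_dependent(), read_transformed()) are made up for the example, and the relaxed C11 load is only a stand-in for the kernel's READ_ONCE(). The second reader shows the kind of rewrite an optimizer might perform if it could prove that only one non-NULL pointer is ever stored, which is exactly the sort of whole-program fact LTO makes easier to see. Whether a given compiler actually does this is not something the sketch claims.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item {
	int value;
};

/* Hypothetical shared state: one object, published through a pointer. */
static struct item only_item;
static _Atomic(struct item *) published = NULL;

/* Writer: initialise the object, then publish it with release ordering. */
void publish(void)
{
	only_item.value = 42;
	atomic_store_explicit(&published, &only_item, memory_order_release);
}

/*
 * Reader as written: the address of the second load (p->value) is computed
 * from the result of the first load, so hardware that honours address
 * dependencies orders the two loads without any barrier.
 */
int read_dependent(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);

	return p ? p->value : -1;
}

/*
 * Hand-written version of a transformation that is legal under C11 rules
 * for a relaxed load: if p can only be NULL or &only_item, the dereference
 * can be replaced by a direct access. The result is the same in a
 * single-threaded run, but the address dependency is gone, so a weakly
 * ordered CPU may now satisfy the second load before the first.
 */
int read_transformed(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);

	return p ? only_item.value : -1;
}
```

Both readers return -1 before publish() and 42 after it when run on a single thread; the difference only matters to a concurrent reader on a weakly ordered machine. C11's answer was memory_order_consume, which GCC and Clang implement by promoting it to acquire.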
Another good read:

  https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/

and having (partially) re-read that, I now worry intensely about things like latch_tree_find(), cyc2ns_read_begin(), __ktime_get_fast_ns().

It looks like kernel/time/sched_clock.c uses raw_read_seqcount() which deviates from the above patterns by, for some reason, using a primitive that includes an extra smp_rmb().

And this is just the few things I could remember off the top of my head, who knows what else is out there.
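
To make the shape of the latch readers mentioned above concrete, here is a minimal user-space sketch of the pattern, assuming a single-word payload instead of a real structure. It is not the kernel's implementation: struct latch, latch_write() and latch_read() are hypothetical names, the writer uses sequentially consistent C11 atomics where the kernel uses smp_wmb(), and memory_order_consume stands in for the dependency the kernel readers rely on. The point is only that the index of the data load is derived from the sequence load, so the reader's correctness hinges on that dependency surviving compilation.

```c
#include <stdatomic.h>

/* Hypothetical latch: two copies of the data plus a sequence counter. */
struct latch {
	atomic_uint seq;
	_Atomic unsigned long data[2];
};

/* Writer: steer readers away from whichever copy is being modified. */
void latch_write(struct latch *l, unsigned long val)
{
	/* seq becomes odd: readers pick data[1] while data[0] changes. */
	atomic_fetch_add(&l->seq, 1);
	atomic_store(&l->data[0], val);

	/* seq becomes even: readers pick data[0] while data[1] changes. */
	atomic_fetch_add(&l->seq, 1);
	atomic_store(&l->data[1], val);
}

/* Reader: never blocks, retries if the writer moved on underneath it. */
unsigned long latch_read(struct latch *l)
{
	unsigned int seq;
	unsigned long val;

	do {
		/*
		 * The data load below is ordered after this load only
		 * through the dependency carried by (seq & 1).
		 */
		seq = atomic_load_explicit(&l->seq, memory_order_consume);
		val = atomic_load_explicit(&l->data[seq & 1],
					   memory_order_relaxed);

		/* Order the data load before re-checking the sequence. */
		atomic_thread_fence(memory_order_acquire);
	} while (seq != atomic_load_explicit(&l->seq, memory_order_relaxed));

	return val;
}
```

With a one-word payload the latch buys nothing, since a single atomic load would do; it is kept small here so the dependency from seq to data[seq & 1] is easy to see. A reader that arrives while seq is odd picks data[1] and still gets the previous, consistent value.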