Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y
From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2020-07-06 19:42:43
Also in: linux-alpha, lkml, virtualization
On Mon, Jul 06, 2020 at 09:23:26PM +0200, Marco Elver wrote:
> On Mon, 6 Jul 2020 at 20:35, Will Deacon [off-list ref] wrote:
> > On Mon, Jul 06, 2020 at 05:00:23PM +0100, Dave Martin wrote:
> > > On Thu, Jul 02, 2020 at 08:23:02AM +0100, Will Deacon wrote:
> > > > On Wed, Jul 01, 2020 at 06:07:25PM +0100, Dave P Martin wrote:
> > > > > Also, can you illustrate code that can only be unsafe with
> > > > > Clang LTO?
> > > >
> > > > I don't have a concrete example, but it's an ongoing concern
> > > > over on the LTO thread [1], so I cooked this to show one way
> > > > we could deal with it. The main concern is that the
> > > > whole-program optimisations enabled by LTO may allow the
> > > > compiler to enumerate possible values for a pointer at link
> > > > time and replace an address dependency between two loads with
> > > > a control dependency instead, defeating the dependency
> > > > ordering within the CPU.
> > >
> > > Why can't that happen without LTO?
> >
> > It could, but I'd argue that it's considerably less likely because
> > there is less information available to the compiler to perform
> > these sorts of optimisations. It also doesn't appear to be
> > happening in practice.
> >
> > The current state of affairs is that, if/when we catch the compiler
> > performing harmful optimisations, we look for a way to disable
> > them. However, there are good reasons to enable LTO, so this is one
> > way to do that without having to worry about the potential impact
> > on dependency ordering.
>
> If it's of any help, I'll see if we can implement that warning in
> LLVM if data dependencies somehow disappear (although I don't have
> any cycles to pursue right now myself). Until then, short of manual
> inspection or encountering a bug in the wild, there is no proof any
> of this happens or doesn't happen.
>
> Also, as some anecdotal evidence it's extremely unlikely, even with
> LTO: looking at the passes that LLVM runs, there are a number of
> passes that seem to want to eliminate basic blocks, thereby getting
> rid of branches. Intuitively, it makes sense, because branches are
> expensive on most architectures (for GPU targets, I think it tries
> even harder to get rid of branches). If we extend our reasoning and
> assumptions of LTO's aggressiveness in that direction, we might
> actually end up with fewer branches. That might be beneficial for
> the data dependencies we worry about (but not so much for control
> dependencies we want to keep).
>
> Still, no point in speculating (no pun intended) until we have hard
> data on what actually happens. :-)
Anything along these lines would be very welcome!!!

							Thanx, Paul
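
For readers following along, here is a rough userspace sketch of the
transformation discussed in the quoted text. It is not taken from the
patch or from any compiler output. All names (struct config, cfg_a,
cfg_b, current_cfg and the functions) are made up for illustration, and
a relaxed C11 atomic load is assumed as a stand-in for the kernel's
READ_ONCE(). The second function is hand-written to show what a
compiler might legally do once it knows the pointer has only two
possible values; the third shows the acquire load that the patch
subject proposes instead.

```c
#include <assert.h>
#include <stdatomic.h>

struct config {
	int value;
};

/* Hypothetical objects: the only two values the pointer ever takes. */
static struct config cfg_a = { .value = 1 };
static struct config cfg_b = { .value = 2 };
static _Atomic(struct config *) current_cfg = &cfg_a;

/* Writer side: publish a new config with release semantics. */
static void publish(struct config *c)
{
	atomic_store_explicit(&current_cfg, c, memory_order_release);
}

/*
 * As written in the source: the address of the second load is computed
 * from the value returned by the first load (an address dependency).
 * The relaxed load is an assumed stand-in for READ_ONCE().
 */
static int read_with_address_dependency(void)
{
	struct config *p =
		atomic_load_explicit(&current_cfg, memory_order_relaxed);

	return p->value;
}

/*
 * Hand-written equivalent of what a compiler could emit after
 * enumerating the possible pointer values: compare, branch, then load
 * from a fixed address. Only a control dependency links the two loads.
 */
static int read_with_control_dependency(void)
{
	struct config *p =
		atomic_load_explicit(&current_cfg, memory_order_relaxed);

	if (p == &cfg_a)
		return cfg_a.value;
	return cfg_b.value;
}

/*
 * The approach in the patch subject: make the first load an acquire,
 * so later loads are ordered after it whether or not a dependency
 * survives optimisation.
 */
static int read_with_acquire(void)
{
	struct config *p =
		atomic_load_explicit(&current_cfg, memory_order_acquire);

	return p->value;
}

/* Single-threaded sanity check: all three readers agree on the value. */
static void self_check(void)
{
	int expected = read_with_acquire();

	assert(read_with_address_dependency() == expected);
	assert(read_with_control_dependency() == expected);
}
```

In a single thread all three readers return the same value, which is
why this kind of rewrite counts as a valid optimisation. The difference
only matters with a concurrent writer on a weakly ordered CPU, where a
control dependency between two loads does not give the ordering that an
address dependency does.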