Re: [PATCH v3 3/5] arm64: atomics: avoid out-of-line ll/sc atomics
From: Mark Rutland <mark.rutland@arm.com>
Date: 2019-08-22 17:02:05
On Mon, Aug 12, 2019 at 03:36:23PM +0100, Andrew Murray wrote:
When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware or toolchain doesn't support it the existing code will fallback to ll/sc atomics. It achieves this by branching from inline assembly to a function that is built with specical compile flags. Further this results in the clobbering of registers even when the fallback isn't used increasing register pressure. Let's improve this by providing inline implementations of both LSE and ll/sc and use a static key to select between them. This allows for the compiler to generate better atomics code. To improve icache performance for the LL/SC fallback atomics, we put them in their own subsection. Please note that as atomic_arch.h is included indirectly by kernel.h (via bitops.h), we cannot depend on features provided later in the kernel.h file. This prevents us from placing the system_uses_lse_atomics function in cpu_feature.h due to its dependencies. Signed-off-by: Andrew Murray <redacted>
[...]
quoted hunk ↗ jump to hunk
diff --git a/arch/arm64/include/asm/atomic_arch.h b/arch/arm64/include/asm/atomic_arch.h new file mode 100644 index 000000000000..255a284321c6 --- /dev/null +++ b/arch/arm64/include/asm/atomic_arch.h@@ -0,0 +1,154 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Selection between LSE and LL/SC atomics. + * + * Copyright (C) 2018 ARM Ltd. + * Author: Andrew Murray <andrew.murray@arm.com> + */ + +#ifndef __ASM_ATOMIC_ARCH_H +#define __ASM_ATOMIC_ARCH_H + +#include <asm/atomic_lse.h> +#include <asm/atomic_ll_sc.h> + +#include <linux/jump_label.h> +#include <asm/cpucaps.h>
I'm guessing that we have to include the <asm/atomic_*> headers first due to the include dependencies. If that's the case, could we please have a comment here to that effect? Minor nit, but could we also order those two alphabetically, please? The general style is to have headers alphabetically, with (for reasons unknown) the <linux/*> headers before the <asm/*> headers. [...]
+#if IS_ENABLED(CONFIG_ARM64_LSE_ATOMICS) && IS_ENABLED(CONFIG_AS_LSE) +#define __LL_SC_FALLBACK(asm_ops) \ +" b 3f\n" \ +" .subsection 1\n" \ +"3:\n" \ +asm_ops "\n" \ +" b 4f\n" \ +" .previous\n" \ +"4:\n" +#else +#define __LL_SC_FALLBACK(asm_ops) asm_ops #endif
Can we instead make the ll/sc functions with the cold attribute (wrapped by <linux/compiler.h> as __cold)? IIUC that should have a similar effect, and might allow GCC to do better (e.g. merging compatible instances of the ll/sc code in the same cold subsection). https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Function-Attributes.html#index-cold-function-attribute Otherwise, this is looking much nicer! Thanks, Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel