Re: [PATCH v3 2/5] arm64: Use correct ll/sc atomic constraints
From: Andrew Murray <hidden>
Date: 2019-08-28 15:44:29
On Wed, Aug 28, 2019 at 04:25:40PM +0100, Mark Rutland wrote:
On Wed, Aug 28, 2019 at 02:01:19PM +0100, Andrew Murray wrote:quoted
On Thu, Aug 22, 2019 at 04:32:23PM +0100, Mark Rutland wrote:quoted
Hi Andrew, On Mon, Aug 12, 2019 at 03:36:22PM +0100, Andrew Murray wrote:quoted
For many of the ll/sc atomic operations we use the 'I' machine constraint regardless to the instruction used - this may not be optimal. Let's add an additional parameter to the ATOMIC_xx macros that allows the caller to specify an appropriate machine constraint. Let's also improve __CMPXCHG_CASE by replacing the 'K' constraint with a caller provided constraint. Please note that whilst we would like to use the 'K' constraint on 32 bit operations, we choose not to provide any constraint to avoid a GCC bug which results in a build error. Earlier versions of GCC (no later than 8.1.0) appear to incorrectly handle the 'K' constraint for the value 4294967295.From reading the above, it's difficult to discern what's a fix and what's an improvement, and I think we need to be more explicit about that. It would also be helpful to have the necessary context up-front. How about: | The A64 ISA accepts distinct (but overlapping) ranges of immediates for: | | * add arithmetic instructions ('I' machine constraint) | * sub arithmetic instructions ('J' machine constraint) | * 32-bit logical instructions ('K' machine constraint) | * 64-bit logical instructions ('L' machine constraint) | | ... but we currently use the 'I' constraint for many atomic operations | using sub or logical instructions, which is not always valid. | | When CONFIG_ARM64_LSE_ATOMICS is not set, this allows invalid immediates | to be passed to instructions, potentially resulting in a build failure. | When CONFIG_ARM64_LSE_ATOMICS is selected the out-of-line ll/sc atomics | always use a register as they have no visibility of the value passed by | the caller. | | This patch adds a constraint parameter to the ATOMIC_xx and | __CMPXCHG_CASE macros so that we can pass appropriate constraints for | each case, with uses updated accordingly. | | Unfortunately prior to GCC 8.1.0 the 'K' constraint erroneously accepted | 0xffffffff, so we must instead force the use of a register.Looks great - I'll adopt this, thanks for writing it.Cool. :)quoted
quoted
Given we haven't had any bug reports, I'm not sure whether this needs a Fixes tag or Cc stable. This has been a latent issue for a long time, but upstream code doesn't seem to have tickled it.Yes I guess this is more a correctness issue rather than a reproducible bug in upstream code. I won't add a fixes tag or CC to stable.Sounds good to me.quoted
quoted
[...]quoted
-ATOMIC_OPS(and, and) -ATOMIC_OPS(andnot, bic) -ATOMIC_OPS(or, orr) -ATOMIC_OPS(xor, eor) +ATOMIC_OPS(and, and, K) +ATOMIC_OPS(andnot, bic, ) +ATOMIC_OPS(or, orr, K) +ATOMIC_OPS(xor, eor, K)Surely it's not safe to use the K constraint here, either? AFAICT code like: atomic_xor(~0, &atom); ... would suffer from the same problem as described for cmpxchg.Thanks for spotting this. Yes, I think the resolution here (and for any 32bit bitmask immediate) is to drop the constraint. Do you agree that we should drop the 'K' constraint for both orr and eor above?Yes, I think we should drop the 'K' for all the 32-bit logical ops.
Ah yes, I keep getting 'and' and 'add' mixed up. OK I'll drop 'K' from all the above.
quoted
quoted
[...]quoted
-ATOMIC64_OPS(and, and) -ATOMIC64_OPS(andnot, bic) -ATOMIC64_OPS(or, orr) -ATOMIC64_OPS(xor, eor) +ATOMIC64_OPS(and, and, K) +ATOMIC64_OPS(andnot, bic, ) +ATOMIC64_OPS(or, orr, K) +ATOMIC64_OPS(xor, eor, K)Shouldn't these be 'L'? IIUC K is a subset of L, so if that's deliberate we should call that out explicitly...Oooh yes that's wrong. I guess the atomic64_[and,or,xor] are rarely called in the kernel which perhaps is why the compiler hasn't shouted at me. Do you agree that the and, orr and eor should all be 'L' instead of 'K'?Yes, I think all the 64-bit logical ops should all use 'L'.
With the exception of bic? I don't think there is an appropriate constraint for this (it requires an 8 bit immediate).
I did a quick local test, and the 'L' constraint seems to correctly reject ~0UL (i.e. it doesn't seem to have a similar bug to the 'K' constraint).
Yes, if I recall correctly the issue was only with 'K'.
quoted
quoted
quoted
+__CMPXCHG_CASE(w, b, , 8, , , , , ) +__CMPXCHG_CASE(w, h, , 16, , , , , ) +__CMPXCHG_CASE(w, , , 32, , , , , ) +__CMPXCHG_CASE( , , , 64, , , , , L) +__CMPXCHG_CASE(w, b, acq_, 8, , a, , "memory", ) +__CMPXCHG_CASE(w, h, acq_, 16, , a, , "memory", ) +__CMPXCHG_CASE(w, , acq_, 32, , a, , "memory", ) +__CMPXCHG_CASE( , , acq_, 64, , a, , "memory", L) +__CMPXCHG_CASE(w, b, rel_, 8, , , l, "memory", ) +__CMPXCHG_CASE(w, h, rel_, 16, , , l, "memory", ) +__CMPXCHG_CASE(w, , rel_, 32, , , l, "memory", ) +__CMPXCHG_CASE( , , rel_, 64, , , l, "memory", L) +__CMPXCHG_CASE(w, b, mb_, 8, dmb ish, , l, "memory", ) +__CMPXCHG_CASE(w, h, mb_, 16, dmb ish, , l, "memory", ) +__CMPXCHG_CASE(w, , mb_, 32, dmb ish, , l, "memory", ) +__CMPXCHG_CASE( , , mb_, 64, dmb ish, , l, "memory", L)... but these uses imply that's not the case.Yup, so I can leave these as they are.Yup; sounds good.
Thanks, Andrew Murray
Thanks, Mark.
_______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel