Thread (26 messages) 26 messages, 10 authors, 2021-07-29

Re: [PATCH RFC 1/2] arch: Introduce ARCH_HAS_HW_XCHG_SMALL

From: Boqun Feng <hidden>
Date: 2021-07-26 17:25:08

On Tue, Jul 27, 2021 at 12:41:34AM +0800, Guo Ren wrote:
On Mon, Jul 26, 2021 at 6:39 PM Boqun Feng [off-list ref] wrote:
quoted
On Mon, Jul 26, 2021 at 04:56:49PM +0800, Huacai Chen wrote:
quoted
Hi, Geert,

On Mon, Jul 26, 2021 at 4:36 PM Geert Uytterhoeven [off-list ref] wrote:
quoted
Hi Huacai,

On Sat, Jul 24, 2021 at 2:36 PM Huacai Chen [off-list ref] wrote:
quoted
Introduce a new Kconfig option ARCH_HAS_HW_XCHG_SMALL, which means arch
has hardware sub-word xchg/cmpxchg support. This option will be used as
an indicator to select the bit-field definition in the qspinlock data
structure.

Signed-off-by: Huacai Chen <redacted>
Thanks for your patch!
quoted
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -228,6 +228,10 @@ config ARCH_HAS_FORTIFY_SOURCE
          An architecture should select this when it can successfully
          build and run with CONFIG_FORTIFY_SOURCE.

+# Select if arch has hardware sub-word xchg/cmpxchg support
+config ARCH_HAS_HW_XCHG_SMALL
What do you mean by "hardware"?
Does a software fallback count?
This new option is supposed as an indicator to select bit-field
definition of qspinlock, software fallback is not helpful in this
case.
I don't think this is true. IIUC, the rationale of the config is that
for some architectures, since the architectural cmpxchg doesn't provide
forward-progress guarantee then using cmpxchg of machine-word to
implement xchg{8,16}() may cause livelock, therefore these architectures
don't want to provide xchg{8,16}(), as a result they cannot work with
qspinlock when _Q_PENDING_BITS is 8.

So as long as an architecture can provide and has already provided an
implementation of xchg{8,16}() which guarantee forward-progress (even
though the implementation is using a machine-word size cmpxchg), the
architecture doesn't need to select ARCH_HAS_HW_XCHG_SMALL.
Seems only atomic could provide forward progress, isn't it? And lr/sc
couldn't implement xchg/cmpxchg primitive properly.
I'm missing you point here, a) ll/sc can provide forward progress and b)
ll/sc instructions are used to implement xchg/cmpxchg (see ARM64 and
PPC).
How to make CPU guarantee  "load + cmpxchg" forward-progress? Fusion
these instructions and lock the snoop channel?
Maybe hardware guys would think that it's easier to implement cas +
dcas + amo(short & byte).
Please note that if _Q_PENDING_BITS == 1, then the xchg_tail() is
implemented as a "load + cmpxchg", so if "load + cmpxchg" implementation
of xchg16() doesn't provide forward-progress in an architecture, neither
does xchg_tail().

Regards,
Boqun
quoted
Regards,
Boqun
quoted
quoted
quoted
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -5,6 +5,7 @@ config M68K
        select ARCH_32BIT_OFF_T
        select ARCH_HAS_BINFMT_FLAT
        select ARCH_HAS_DMA_PREP_COHERENT if HAS_DMA && MMU && !COLDFIRE
+       select ARCH_HAS_HW_XCHG_SMALL
M68k CPUs which support the CAS (Compare And Set) instruction do
support this on 8-bit, 16-bit, and 32-bit quantities.
M68k CPUs which lack CAS use a software implementation, which
supports the same quantities.

As CAS is used only if CONFIG_RMW_INSNS=y, perhaps this needs
a dependency?
OK, I think this dependency is needed.

Huacai
quoted
   select ARCH_HAS_HW_XCHG_SMALL if RMW_INSNS

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help