Thread (63 messages) 63 messages, 12 authors, 2021-04-07

RE: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Anup Patel <hidden>
Date: 2021-03-30 04:55:24
Also in: linux-riscv, lkml

-----Original Message-----
From: Guo Ren <guoren@kernel.org>
Sent: 30 March 2021 08:44
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-riscv <redacted>; Linux Kernel Mailing List
[off-list ref]; linux-csky@vger.kernel.org; linux-arch
[off-list ref]; Guo Ren [off-list ref]; Will
Deacon [off-list ref]; Ingo Molnar [off-list ref]; Waiman
Long [off-list ref]; Arnd Bergmann [off-list ref]; Anup
Patel [off-list ref]
Subject: Re: [PATCH v4 3/4] locking/qspinlock: Add
ARCH_USE_QUEUED_SPINLOCKS_XCHG32

On Mon, Mar 29, 2021 at 8:50 PM Peter Zijlstra [off-list ref]
wrote:
quoted
On Mon, Mar 29, 2021 at 08:01:41PM +0800, Guo Ren wrote:
quoted
u32 a = 0x55aa66bb;
u16 *ptr = &a;

CPU0                       CPU1
=========             =========
xchg16(ptr, new)     while(1)
                                    WRITE_ONCE(*(ptr + 1), x);

When we use lr.w/sc.w implement xchg16, it'll cause CPU0 deadlock.
Then I think your LL/SC is broken.

That also means you really don't want to build super complex locking
primitives on top, because that live-lock will percolate through.
Do you mean the below implementation has live-lock risk?
+static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
+{
+       u32 old, new, val = atomic_read(&lock->val);
+
+       for (;;) {
+               new = (val & _Q_LOCKED_PENDING_MASK) | tail;
+               old = atomic_cmpxchg(&lock->val, val, new);
+               if (old == val)
+                       break;
+
+               val = old;
+       }
+       return old;
+}

quoted
Step 1 would be to get your architecute fixed such that it can provide
fwd progress guarantees for LL/SC. Otherwise there's absolutely no
point in building complex systems with it.
Quote Waiman's comment [1] on xchg16 optimization:

"This optimization is needed to make the qspinlock achieve performance
parity with ticket spinlock at light load."

[1] https://lore.kernel.org/kvm/1429901803-29771-6-git-send-email-
Waiman.Long@hp.com/

So for a non-xhg16 machine:
 - ticket-lock for small numbers of CPUs
 - qspinlock for large numbers of CPUs

Okay, I'll put all of them into the next patch 
I would suggest to have separate Kconfig opitons for ticket spinlock
in Linux RISC-V which will be disabled by default. This means Linux
RISC-V will use qspinlock by default and use ticket spinlock only when
ticket spinlock kconfig is enabled.

Regards,
Anup
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help