Re: [PATCH v14 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock
From: Peter Zijlstra <peterz@infradead.org>
Date: 2021-04-13 11:30:53
Also in:
linux-arch, lkml
On Thu, Apr 01, 2021 at 11:31:53AM -0400, Alex Kogan wrote:
+/* + * cna_splice_tail -- splice the next node from the primary queue onto + * the secondary queue. + */ +static void cna_splice_next(struct mcs_spinlock *node, + struct mcs_spinlock *next, + struct mcs_spinlock *nnext)
You forgot to update the comment when you changed the name on this thing.
+/*
+ * cna_order_queue - check whether the next waiter in the main queue is on
+ * the same NUMA node as the lock holder; if not, and it has a waiter behind
+ * it in the main queue, move the former onto the secondary queue.
+ */
+static void cna_order_queue(struct mcs_spinlock *node)
+{
+ struct mcs_spinlock *next = READ_ONCE(node->next);
+ struct cna_node *cn = (struct cna_node *)node;
+ int numa_node, next_numa_node;
+
+ if (!next) {
+ cn->partial_order = LOCAL_WAITER_NOT_FOUND;
+ return;
+ }
+
+ numa_node = cn->numa_node;
+ next_numa_node = ((struct cna_node *)next)->numa_node;
+
+ if (next_numa_node != numa_node) {
+ struct mcs_spinlock *nnext = READ_ONCE(next->next);
+
+ if (nnext) {
+ cna_splice_next(node, next, nnext);
+ next = nnext;
+ }
+ /*
+ * Inherit NUMA node id of primary queue, to maintain the
+ * preference even if the next waiter is on a different node.
+ */
+ ((struct cna_node *)next)->numa_node = numa_node;
+ }
+}
So the obvious change since last time I looked a this is that it now
only looks 1 entry ahead. Which makes sense I suppose.
I'm not really a fan of the 'partial_order' name combined with that
silly enum { LOCAL_WAITER_FOUND, LOCAL_WAITER_NOT_FOUND }. That's just
really bad naming all around. The enum is about having a waiter while
the variable is about partial order, that doesn't match at all.
If you rename the variable to 'has_waiter' and simply use 0,1 values,
things would be ever so more readable. But I don't think that makes
sense, see below.
I'm also not sure about that whole numa_node thing, why would you
over-write the numa node, why at this point ?
+
+/* Abuse the pv_wait_head_or_lock() hook to get some work done */
+static __always_inline u32 cna_wait_head_or_lock(struct qspinlock *lock,
+ struct mcs_spinlock *node)
+{
+ /*
+ * Try and put the time otherwise spent spin waiting on
+ * _Q_LOCKED_PENDING_MASK to use by sorting our lists.
+ */
+ cna_order_queue(node);
+
+ return 0; /* we lied; we didn't wait, go do so now */So here we inspect one entry ahead and then quit. I can't rmember, but did we try something like: /* * Try and put the time otherwise spent spin waiting on * _Q_LOCKED_PENDING_MASK to use by sorting our lists. * Move one entry at a go until either the list is fully * sorted or we ran out of spin condition. */ while (READ_ONCE(lock->val) & _Q_LOCKED_PENDING_MASK && node->partial_order) cna_order_queue(node); return 0; This will keep moving @next to the remote list until such a time that we're forced to continue or @next is local.
+}
+
+static inline void cna_lock_handoff(struct mcs_spinlock *node,
+ struct mcs_spinlock *next)
+{
+ struct cna_node *cn = (struct cna_node *)node;
+ u32 val = 1;
+
+ u32 partial_order = cn->partial_order;
+
+ if (partial_order == LOCAL_WAITER_NOT_FOUND)
+ cna_order_queue(node);
+AFAICT this is where playing silly games with ->numa_node belong; but right along with that goes a comment that describes why any of that makes sense. I mean, if you leave your node, for any reason, why bother coming back to it, why not accept it is a sign of the gods you're overdue for a node-change? Was the efficacy of this scheme tested?
+ /* + * We have a local waiter, either real or fake one; + * reload @next in case it was changed by cna_order_queue(). + */ + next = node->next; + if (node->locked > 1) + val = node->locked; /* preseve secondary queue */
IIRC we used to do: val |= node->locked; Which is simpler for not having branches. Why change a good thing?
+ + arch_mcs_lock_handoff(&next->locked, val); +}
_______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel