Thread: 28 messages, 6 authors, 2023-08-05

Re: [PATCH v15 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock

From: Peter Zijlstra <peterz@infradead.org>
Date: 2023-08-04 08:26:27
Also in: linux-arch, lkml

On Fri, Aug 04, 2023 at 09:33:48AM +0800, Guo Ren wrote:
> On Thu, Aug 3, 2023 at 7:57 PM Peter Zijlstra [off-list ref] wrote:
> > CNA should only show a benefit when there is strong inter-node
> > contention, and in that case it is typically best to fix the kernel side
> > locking.
> >
> > Hence the question as to what lock prompted you to look at this.
>
> I met the long lock queue situation when the hardware gave an overly
> aggressive store queue merge buffer delay mechanism. See:
> https://lore.kernel.org/linux-riscv/20230802164701.192791-8-guoren@kernel.org/

*groan*, so you're using it to work around 'broken' hardware :-(

Wouldn't that hardware have horrifically bad lock throughput anyway?
Everybody would end up waiting on that store buffer delay.

> This also let me consider improving the efficiency of the long lock
> queue release. For example, if the queue is like this:
>
> (Node0 cpu0) -> (Node1 cpu64) -> (Node0 cpu1) -> (Node1 cpu65) ->
> (Node0 cpu2) -> (Node1 cpu66) -> ...
>
> Then every mcs_unlock would cause a cross-NUMA transaction. But if we
> could make the queue like this:

See, this is where the ARM64 WFE would come in handy; I don't suppose
RISC-V has anything like that?

Also, by the time you have 6 waiters, I'd say the lock is terribly
contended and you should look at improving the locking scheme.
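
To put a number on the quoted queue example, here is a minimal sketch that
counts how many lock hand-offs cross a NUMA node boundary for a given waiter
order. It is an illustration only, not code from the patch:
cross_node_handoffs() and both arrays are hypothetical names, and the
"grouped" order is an assumed node-sorted arrangement, since the reordered
queue is not shown in the quoted text.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the kernel or the CNA patch):
 * given the NUMA node of each waiter in queue order, count how many
 * hand-offs from one waiter to the next move between nodes.
 */
static size_t cross_node_handoffs(const int *node_of_waiter, size_t n)
{
	size_t crossings = 0;

	for (size_t i = 1; i < n; i++) {
		if (node_of_waiter[i] != node_of_waiter[i - 1])
			crossings++;
	}
	return crossings;
}

/* Order from the quoted example: cpu0, cpu64, cpu1, cpu65, cpu2, cpu66. */
static const int interleaved[] = { 0, 1, 0, 1, 0, 1 };

/* Assumed node-grouped order for comparison (not taken from the message). */
static const int grouped[] = { 0, 0, 0, 1, 1, 1 };
```

With these six waiters, cross_node_handoffs(interleaved, 6) counts every
hand-off between neighbours as a node crossing, while
cross_node_handoffs(grouped, 6) counts only the single switch from Node0 to
Node1.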
