Re: [PATCH v15 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock
From: Peter Zijlstra <peterz@infradead.org>
Date: 2023-08-04 08:26:27
Also in: linux-arch, lkml
On Fri, Aug 04, 2023 at 09:33:48AM +0800, Guo Ren wrote:
On Thu, Aug 3, 2023 at 7:57 PM Peter Zijlstra [off-list ref] wrote:
CNA should only show a benefit when there is strong inter-node contention, and in that case it is typically best to fix the kernel side locking. Hence the question as to what lock prompted you to look at this.
I met the long lock queue situation when the hardware gave an overly aggressive store queue merge buffer delay mechanism. See: https://lore.kernel.org/linux-riscv/20230802164701.192791-8-guoren@kernel.org/
*groan*, so you're using it to work around 'broken' hardware :-( Wouldn't that hardware have horrifically bad lock throughput anyway? Everybody would end up waiting on that store buffer delay.
This also let me consider improving the efficiency of the long lock queue release. For example, if the queue is like this:
(Node0 cpu0) -> (Node1 cpu64) -> (Node0 cpu1) -> (Node1 cpu65) -> (Node0 cpu2) -> (Node1 cpu66) -> ...
Then every mcs_unlock would cause a cross-NUMA transaction. But if we could make the queue like this:
See, this is where the ARM64 WFE would come in handy; I don't suppose RISC-V has anything like that?
Also, by the time you have 6 waiters, I'd say the lock is terribly contended and you should look at improving the locking scheme.