Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()
From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2026-03-26 15:40:04
Also in: bpf, linux-arch, linux-pm, lkml
On Wed, Mar 25, 2026 at 08:23:57PM +0000, David Laight wrote:
> On Wed, 25 Mar 2026 16:32:49 +0000 Catalin Marinas [off-list ref] wrote:
> > On Wed, Mar 25, 2026 at 03:42:10PM +0000, David Laight wrote:
> > > ...
> > > Looking at the code I think the "sevl; wfe" pair should be higher up.
> >
> > Yes, I replied to your other message. We could move it higher indeed, before the condition check, but I can't get my head around the ordering. Can need_resched() check be speculated before the WFE? I need to think some more.
>
> I don't think speculation can matter. Both SEVL and WFE must be serialised against any other instructions that can change the event flag (as well as each other) otherwise everything is broken.

Welcome to the Arm memory model. We don't have any guarantee that an LDR will only access memory after SEVL+WFE. They are not serialising.
> Apart from that it doesn't matter, what matters is the instruction boundary the interrupt happens at.
True. If an interrupt is taken before the LDR (that would be a need_resched() check for example), then a prior WFE would not matter. This won't work if we replace the IPI with a SEV though (suggested somewhere in this thread).
> Actually both SEVL and WFE may be synchronising instructions and very slow.
Most likely not.
> So you may not want to put them in the fast path where the condition
> is true on entry (or even true after a retry).
> So the code might have to look like:
>     for (;;) {
>         VAL = mem;

If we only waited on the location passed to LDXR, things would have been much simpler. But the osq_lock() also wants to wait on the TIF flags via need_resched() (and vcpu_is_preempted()).

>         if (cond(VAL)) return;

So the cond(VAL) here is actually a series of other memory loads unrelated to 'mem'.

>         SEVL; WFE; if (cond(VAL)) return;

I think this will work in principle even if 'cond' accesses other memory locations, though I wouldn't bother with an additional 'cond' call, I'd expect SEVL+WFE to be mostly NOPs. However, 'cond' must not set a local event, otherwise the power saving on waiting is gone.

>         v1 = LDX(mem); if (v1 == VAL) WFE;
>     }

I think it's cleaner to use Ankur's timeout API here for the very rare case where an IPI hits at the wrong time. We then keep smp_cond_load_relaxed() intact as it's really not meant to wait on multiple memory locations to change. Any changes of smp_cond_load_relaxed() with moving the WFE around are just hacks, not the intended use of this API.

-- Catalin
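
For illustration only, and not the code from this series: a minimal user-space C11 sketch of a bounded wait whose condition also reads a second, unrelated memory location. Every name here (lock_word, resched_flag, wait_done(), cond_load_relaxed_timeout()) is hypothetical and is not the kernel API or its signature. The sketch busy-polls where arm64 would wait in WFE, so it shows only why a deadline bounds the wait when the wake-up for the second location is missed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins, not kernel symbols. */
static _Atomic int lock_word;       /* the location being waited on */
static _Atomic bool resched_flag;   /* unrelated location, like a TIF flag */

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* The condition reads memory other than the waited-on word. */
static bool wait_done(int val)
{
	return val != 0 ||
	       atomic_load_explicit(&resched_flag, memory_order_relaxed);
}

/*
 * Poll *ptr until wait_done() holds or timeout_ns has elapsed.
 * Returns the last value loaded; the caller re-checks the condition
 * to tell success from timeout.
 */
static int cond_load_relaxed_timeout(_Atomic int *ptr, uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;
	int val;

	assert(ptr != NULL);
	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (wait_done(val))
			break;
		/* The deadline bounds the wait if no wake-up ever arrives. */
		if (now_ns() >= deadline)
			break;
		/* A real arm64 implementation would wait for an event here. */
	}
	return val;
}
```

A caller would re-evaluate wait_done() on the returned value; if it is still false, the wait timed out and the caller can loop or back off.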