Thread: 34 messages, 6 authors, 2026-04-03

Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()

From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2026-03-26 15:40:04
Also in: bpf, linux-arch, linux-pm, lkml

On Wed, Mar 25, 2026 at 08:23:57PM +0000, David Laight wrote:
> On Wed, 25 Mar 2026 16:32:49 +0000
> Catalin Marinas [off-list ref] wrote:
>> On Wed, Mar 25, 2026 at 03:42:10PM +0000, David Laight wrote:
>> ...
>>> Looking at the code I think the "sevl; wfe" pair should be higher up.
>> Yes, I replied to your other message. We could move it higher indeed,
>> before the condition check, but I can't get my head around the ordering.
>> Can need_resched() check be speculated before the WFE? I need to think
>> some more.
> I don't think speculation can matter.
> Both SEVL and WFE must be serialised against any other instructions
> that can change the event flag (as well as each other) otherwise
> everything is broken.

Welcome to the Arm memory model. We don't have any guarantee that an LDR
will only access memory after SEVL+WFE. They are not serialising.

> Apart from that it doesn't matter, what matters is the instruction
> boundary the interrupt happens at.

True. If an interrupt is taken before the LDR (that would be a
need_resched() check for example), then a prior WFE would not matter.
This won't work if we replace the IPI with a SEV though (suggested
somewhere in this thread).

> Actually both SEVL and WFE may be synchronising instructions and very slow.

Most likely not.

> So you may not want to put them in the fast path where the condition
> is true on entry (or even true after a retry).
> So the code might have to look like:
> 	for (;;) {
> 		VAL = mem;

If we only waited on the location passed to LDXR, things would have been
much simpler. But the osq_lock() also wants to wait on the TIF flags via
need_resched() (and vcpu_is_preempted()).

> 		if (cond(VAL)) return;

So the cond(VAL) here is actually a series of other memory loads
unrelated to 'mem'

> 		SEVL; WFE;
> 		if (cond(VAL)) return;

I think this will work in principle even if 'cond' accesses other memory
locations, though I wouldn't bother with an additional 'cond' call, I'd
expect SEVL+WFE to be mostly NOPs. However, 'cond' must not set a local
event, otherwise the power saving on waiting is gone.

> 		v1 = LDX(mem);
> 		if (v1 == VAL)
> 			WFE;
> 	}

I think it's cleaner to use Ankur's timeout API here for the very rare
case where an IPI hits at the wrong time. We then keep
smp_cond_load_relaxed() intact as it's really not meant to wait on
multiple memory locations to change. Any changes of
smp_cond_load_relaxed() with moving the WFE around are just hacks, not
the intended use of this API.
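
To illustrate the shape of a bounded wait, here is a minimal, portable C11
sketch. It is not the code from this series and not arm64 code: the function
name cond_load_relaxed_timeout_sketch(), the cond_fn type, struct wait_ctx,
cpu_wait_hint() and the iteration-count budget are all hypothetical, chosen
only for the example (a real timeout would presumably be time based). The
point is that the condition may read locations other than the polled word, so
a missed wakeup on one of them is bounded by the timeout rather than by
reordering the wait hint.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the arch wait hint (WFE on arm64); no-op here. */
static inline void cpu_wait_hint(void) { }

/* Hypothetical predicate: may read locations other than the polled word. */
typedef bool (*cond_fn)(uint32_t val, void *arg);

struct wait_ctx {
	_Atomic bool *resched;	/* stand-in for the flag behind need_resched() */
};

/* True when the polled word is 0 or the (assumed) resched flag is set. */
static bool unlocked_or_resched(uint32_t val, void *arg)
{
	struct wait_ctx *ctx = arg;

	return val == 0 ||
	       atomic_load_explicit(ctx->resched, memory_order_relaxed);
}

/*
 * Poll *addr with relaxed loads until cond() holds or the budget runs out.
 * With budget == N the word is loaded at most N + 1 times.
 * Returns the last value loaded; *timed_out reports which exit was taken.
 */
static uint32_t cond_load_relaxed_timeout_sketch(_Atomic uint32_t *addr,
						 cond_fn cond, void *arg,
						 uint64_t budget,
						 bool *timed_out)
{
	uint32_t val;

	assert(addr && cond && timed_out);

	for (;;) {
		val = atomic_load_explicit(addr, memory_order_relaxed);
		if (cond(val, arg)) {
			*timed_out = false;
			return val;
		}
		if (budget-- == 0) {
			*timed_out = true;
			return val;
		}
		cpu_wait_hint();
	}
}
```

A caller would check *timed_out and, on timeout, simply re-evaluate its
condition and call again, which covers the rare case of an IPI landing
between the condition check and the wait.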

-- 
Catalin