Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and... | linuxppc-dev

[PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics · Boqun Feng <hidden> · 2015-10-12
[PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Boqun Feng <hidden> · 2015-10-12
Re: [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Boqun Feng <hidden> · 2015-10-12
[PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants · Boqun Feng <hidden> · 2015-10-12
[PATCH v3 3/6] atomics: Allow architectures to define their own __atomic_op_* helpers · Boqun Feng <hidden> · 2015-10-12
[PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants · Boqun Feng <hidden> · 2015-10-12
Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants · Will Deacon <hidden> · 2015-10-13
Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants · Boqun Feng <hidden> · 2015-10-13
Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants · Boqun Feng <hidden> · 2015-10-14
[PATCH v3 5/6] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants · Boqun Feng <hidden> · 2015-10-12
[PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-12
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Will Deacon <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Will Deacon <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Will Deacon <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-13
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-14
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Will Deacon <hidden> · 2015-10-14
Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants · Boqun Feng <hidden> · 2015-10-13
[PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Boqun Feng <hidden> · 2015-10-12
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Michael Ellerman <mpe@ellerman.id.au> · 2015-10-14
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Boqun Feng <hidden> · 2015-10-14
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Peter Zijlstra <peterz@infradead.org> · 2015-10-14
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Boqun Feng <hidden> · 2015-10-14
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Peter Zijlstra <peterz@infradead.org> · 2015-10-14
Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier · Michael Ellerman <mpe@ellerman.id.au> · 2015-10-14
Re: [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics · Peter Zijlstra <peterz@infradead.org> · 2015-10-13
Re: [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics · Paul E. McKenney <hidden> · 2015-10-13

Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants

From: Will Deacon <hidden>
Date: 2015-10-13 14:43:36
Also in: lkml

On Tue, Oct 13, 2015 at 10:32:59PM +0800, Boqun Feng wrote:

On Tue, Oct 13, 2015 at 02:24:04PM +0100, Will Deacon wrote:

On Mon, Oct 12, 2015 at 10:14:06PM +0800, Boqun Feng wrote:

Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
which _release variants can be built.

To avoid superfluous barriers in _acquire variants, we implement these
operations with assembly code rather than using __atomic_op_acquire() to
build them automatically.

The "superfluous barriers" are for the case where the cmpxchg fails, right?

Yes.
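
To make the "superfluous barrier" point concrete, here is a minimal userspace sketch in C11, not the powerpc kernel code from the patch. It assumes the generic helper amounts to a relaxed operation followed by an unconditional barrier, and contrasts that with a variant that only pays for the barrier when the compare succeeds. The function names are hypothetical, and atomic_thread_fence() merely stands in for the architecture's acquire barrier.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: relaxed compare-and-exchange that returns the value
 * observed in *p, mirroring the return convention of the kernel's cmpxchg(). */
static unsigned int cmpxchg_relaxed_sketch(atomic_uint *p, unsigned int old,
					   unsigned int new)
{
	unsigned int prev = old;

	/* On failure, prev is overwritten with the current value of *p. */
	atomic_compare_exchange_strong_explicit(p, &prev, new,
						memory_order_relaxed,
						memory_order_relaxed);
	return prev;
}

/* Generic construction (assumption): relaxed op plus an unconditional fence,
 * so the fence is executed even when the compare fails. */
static unsigned int cmpxchg_acquire_generic(atomic_uint *p, unsigned int old,
					    unsigned int new)
{
	unsigned int prev = cmpxchg_relaxed_sketch(p, old, new);

	atomic_thread_fence(memory_order_acquire);
	return prev;
}

/* Hand-written style: the fence is skipped on the failure path. */
static unsigned int cmpxchg_acquire_cond(atomic_uint *p, unsigned int old,
					 unsigned int new)
{
	unsigned int prev = cmpxchg_relaxed_sketch(p, old, new);

	if (prev == old)
		atomic_thread_fence(memory_order_acquire);
	return prev;
}
```

Both variants return the same values; they differ only in whether a failed compare still executes a barrier.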

And you don't do the same thing for release, because you want to avoid a
barrier in the middle of the critical section?

Mostly because of the comments in include/linux/atomic.h:

 * For compound atomics performing both a load and a store, ACQUIRE
 * semantics apply only to the load and RELEASE semantics only to the
 * store portion of the operation. Note that a failed cmpxchg_acquire
 * does -not- imply any memory ordering constraints.

so I thought only the barrier in cmpxchg_acquire() is conditional, and
the barrier in cmpxchg_release() is not. Maybe we'd better call out
that the cmpxchg *family* doesn't have any ordering guarantee if the
compare fails, as a complement to

ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")

Because it seems this commit only claims that the barriers in the fully
ordered version are conditional.

I didn't think this was ambiguous... A failed cmpxchg_release doesn't
perform a store, so because the RELEASE semantics only apply to the
store portion of the operation, it therefore doesn't have any ordering
guarantees. Acquire is called out as a special case because it *does*
actually perform a load on the failure case.
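
As a rough userspace analogy (C11 atomics, not the kernel API), the same split shows up in the two memory-order arguments of a compare-exchange: the success order covers the store, while the failure order covers only the load. The function name below is hypothetical, and this sketch only illustrates the semantics described above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical sketch of a release-ordered cmpxchg that returns the value
 * observed in *p. */
static unsigned int cmpxchg_release_sketch(atomic_uint *p, unsigned int old,
					   unsigned int new)
{
	unsigned int prev = old;

	/*
	 * Success: the store is performed with RELEASE ordering.
	 * Failure: no store happens, so only a relaxed load is performed
	 * and no ordering is implied.
	 */
	atomic_compare_exchange_strong_explicit(p, &prev, new,
						memory_order_release,
						memory_order_relaxed);
	return prev;
}
```

A caller can tell the two cases apart by comparing the return value with `old`; only the equal case carries the release ordering.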

If cmpxchg_release has no ordering guarantee when it fails, I guess I
can implement it with a barrier in the middle as you mentioned:

	unsigned int prev;

	__asm__ __volatile__ (
"1:	lwarx	%0,0,%2\n\
	cmpw	0,%0,%3\n\
	bne-	2f\n"
	PPC_RELEASE_BARRIER
"	stwcx.	%4,0,%2\n\
	bne-	1b"
	"\n\
2:"
	: "=&r" (prev), "+m" (*p)
	: "r" (p), "r" (old), "r" (new)
	: "cc", "memory");

	return prev;


However, I need to check whether the architecture allows this and
whether any other problems exist.

Besides, I don't think it's a good idea to do the "put barrier in the
middle" thing in this patchset, because that seems to be a premature
optimization, and if we go further, I guess we could also replace the
PPC_RELEASE_BARRIER above with a "sync" to implement a fully ordered
version of cmpxchg(). That would need too much investigation for now.

Putting a barrier in the middle of that critical section is probably a
terrible idea, and that's why I thought you were avoiding it (hence my
original question). Perhaps just add a comment to that effect, since I
fear adding more words to memory-barriers.txt is just likely to create
further confusion.

Will
