Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation
From: Will Deacon <hidden>
Date: 2015-10-08 12:59:44
Also in: linux-arch, lkml
On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
>
> > > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > > acquisition on PPC -- putting it between the unlock and the lock
> > > of course doesn't cut it for the cross-thread unlock/lock case.
>
> This ^, that makes me think I don't understand
> smp_mb__after_unlock_lock.
>
> How is:
>
> 	UNLOCK x
> 	smp_mb__after_unlock_lock()
> 	LOCK y
>
> a problem? That's still a full barrier.

I thought Paul was talking about something like this case:

CPU A         CPU B                 CPU C
foo = 1
UNLOCK x
              LOCK x
              (RELEASE) bar = 1
                                    ACQUIRE bar = 1
                                    READ_ONCE foo = 0

but this looks the same as ISA2+lwsyncs/ISA2+lwsync+ctrlisync+lwsync,
which are both forbidden on PPC, so now I'm also confused.
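
To make the three-CPU shape above concrete, here is a minimal userspace sketch of it. Everything in it beyond the names foo, bar and x is an assumption for illustration: it uses C11 atomics and pthreads as stand-ins for the kernel primitives, models UNLOCK x as a store-release and LOCK x as a single load-acquire of a hypothetical x_unlocked flag, and count_forbidden() is a made-up helper. It exercises the C11 release/acquire rules on whatever machine runs it; it is not a PPC litmus test and says nothing about lwsync-based lock implementations.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Shared state; foo, bar and x follow the table, the rest is hypothetical. */
static atomic_int foo, bar;
static atomic_int x_unlocked;	/* becomes 1 once "CPU A" has unlocked x */
static int b_saw_unlock, c_saw_bar, c_foo;

static void *cpu_a(void *arg)
{
	(void)arg;
	atomic_store_explicit(&foo, 1, memory_order_relaxed);
	/* UNLOCK x, modelled as a store-release. */
	atomic_store_explicit(&x_unlocked, 1, memory_order_release);
	return NULL;
}

static void *cpu_b(void *arg)
{
	(void)arg;
	/* LOCK x, modelled as one load-acquire (no spinning). */
	b_saw_unlock = atomic_load_explicit(&x_unlocked, memory_order_acquire);
	/* (RELEASE) bar = 1 */
	atomic_store_explicit(&bar, 1, memory_order_release);
	return NULL;
}

static void *cpu_c(void *arg)
{
	(void)arg;
	/* ACQUIRE bar */
	c_saw_bar = atomic_load_explicit(&bar, memory_order_acquire);
	/* READ_ONCE foo */
	c_foo = atomic_load_explicit(&foo, memory_order_relaxed);
	return NULL;
}

/*
 * Run the three threads `iters` times and count the outcome from the
 * table: B locked after A's unlock, C saw bar == 1, yet C read foo == 0.
 */
int count_forbidden(int iters)
{
	int hits = 0;

	for (int i = 0; i < iters; i++) {
		pthread_t t[3];
		int rc;

		atomic_store(&foo, 0);
		atomic_store(&bar, 0);
		atomic_store(&x_unlocked, 0);
		b_saw_unlock = c_saw_bar = c_foo = 0;

		rc = pthread_create(&t[0], NULL, cpu_a, NULL);
		assert(rc == 0);
		rc = pthread_create(&t[1], NULL, cpu_b, NULL);
		assert(rc == 0);
		rc = pthread_create(&t[2], NULL, cpu_c, NULL);
		assert(rc == 0);
		(void)rc;

		/* Joining makes the per-thread results safe to read here. */
		for (int j = 0; j < 3; j++)
			pthread_join(t[j], NULL);

		if (b_saw_unlock && c_saw_bar && c_foo == 0)
			hits++;
	}
	return hits;
}
```

Calling count_forbidden() from a small main() and printing the result is enough to run it. Under the C11 model the release/acquire chain from A through B to C is transitive, so the counted outcome should never occur; a run on real hardware can only fail to observe it, which is why a proper litmus test is still the right tool for the PPC question.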
The different-lock, same-thread case is more straightforward, I think.

> > > I am with Peter -- we do need the benchmark results for PPC.
> >
> > Urgh, sorry guys. I have been slowly doing some benchmarks, but time is not
> > plentiful at the moment.
> >
> > If we do a straight lwsync -> sync conversion for unlock it looks like that
> > will cost us ~4.2% on Anton's standard context switch benchmark.

Thanks Michael!

> And that does not seem to agree with Paul's smp_mb__after_unlock_lock()
> usage and would not be sufficient for the same (as of yet unexplained)
> reason.
>
> Why does it matter which of the LOCK or UNLOCK gets promoted to full
> barrier on PPC in order to become RCsc?

I think we need a PPC litmus test illustrating the inter-thread, same
lock failure case when smp_mb__after_unlock_lock is not present so that
we can reason about this properly. Paul?

Will