Thread: 117 messages, 17 authors, 2013-11-11

Re: perf events ring buffer memory barrier on powerpc

From: Paul E. McKenney <hidden>
Date: 2013-11-04 10:53:01

On Mon, Nov 04, 2013 at 09:57:17AM +0000, Will Deacon wrote:
Hi Paul,

On Sun, Nov 03, 2013 at 10:47:12PM +0000, Paul E. McKenney wrote:
On Sun, Nov 03, 2013 at 05:07:59PM +0000, Will Deacon wrote:
On Sun, Nov 03, 2013 at 02:40:17PM +0000, Paul E. McKenney wrote:
On Sat, Nov 02, 2013 at 10:32:39AM -0700, Paul E. McKenney wrote:
On Fri, Nov 01, 2013 at 03:56:34PM +0100, Peter Zijlstra wrote:
On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
Now the whole crux of the question is if we need barrier A at all, since
the STORES issued by the @buf writes are dependent on the ubuf->tail
read.
The dependency you are talking about is via the "if" statement?
Even C/C++11 is not required to respect control dependencies.

This one is a bit annoying.  The x86 TSO means that you really only
need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
barrier, and so on -- but smp_mb() emits a full barrier.

Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
before reads, writes before writes, and reads before writes, but not
writes before reads?  Another approach would be to define a per-arch
barrier for this particular case.
I suppose we can only introduce new barrier primitives if there's more
than 1 use-case.
Which barrier did you have in mind when you refer to `recent ARM' above? It
seems to me like you'd need a combination of dmb ishld and dmb ishst, since
the former doesn't order writes before writes.
I heard a rumor that ARM had recently added a new dmb variant that acted
similarly to PowerPC's lwsync, and it was on my list to follow up.

Given your response, I am guessing that there is no truth to this rumor...
I think you're talking about the -ld option to dmb, which was introduced in
ARMv8. That option orders loads against loads and stores, but doesn't order
writes against writes. So you could do:

	dmb ishld
	dmb ishst

but it's questionable whether that performs better than a dmb ish.
If Linus's smp_store_with_release_semantics() approach works out, ARM
should be able to use its shiny new ldar and stlr instructions.

							Thanx, Paul
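
For readers following the thread, here is a minimal userspace sketch of the
release/acquire idea mentioned above, written with C11 atomics rather than
the kernel primitives under discussion. It is not the perf ring buffer code:
struct ring, ring_push(), ring_pop() and RB_SIZE are hypothetical names, and
the buffer is a simplified single-producer/single-consumer queue of ints. The
producer's acquire load of the tail takes the place of relying on the control
dependency questioned earlier, and the release store of the head publishes
the data write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RB_SIZE 8u /* hypothetical capacity; must be a power of two */
static_assert((RB_SIZE & (RB_SIZE - 1u)) == 0, "RB_SIZE must be a power of two");

struct ring {
    _Atomic size_t head; /* advanced only by the producer */
    _Atomic size_t tail; /* advanced only by the consumer */
    int data[RB_SIZE];
};

/* Producer side: returns false when the buffer is full. */
static bool ring_push(struct ring *r, int v)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire pairs with the consumer's release store to tail, so the
     * consumer's read of a slot is complete before we overwrite it. */
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RB_SIZE)
        return false;

    r->data[head & (RB_SIZE - 1u)] = v;
    /* Release orders the data store before the head update becomes visible. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the buffer is empty. */
static bool ring_pop(struct ring *r, int *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire pairs with the producer's release store to head, so the
     * data read below sees what the producer wrote. */
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;

    *out = r->data[tail & (RB_SIZE - 1u)];
    /* Release orders the data read before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

On ARMv8, compilers typically implement the release stores and acquire loads
in this sketch with stlr and ldar, and on x86 they need no fence instruction;
whether this maps cleanly onto the perf user/kernel interface is exactly what
the thread is debating.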