Re: [PATCH 2/4] arch: Add lightweight memory barriers fast_rmb() and fast_wmb()


From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2014-11-18 22:35:55
Also in: linux-arch, lkml

On Mon, 2014-11-17 at 19:13 -0800, Alexander Duyck wrote:

> ARM adds some funky things.  They have two different types of
> primitives, a dmb() which is a data memory barrier, and a dsb() which is
> a data synchronization barrier.  Then with each of those they have the
> "domains" the barriers are effective within.
>
> So for example on ARM a rmb() is dsb(sy) which means it is a system wide
> synchronization barrier which stops execution on the CPU core until the
> read completes.

That's amazingly heavy-handed ... I can see that being useful for MMIO,
we do something similar in our MMIO accessors by using a special variant
of trap instruction that never traps to make the core think the load
value has been consumed. But that's typically only needed to guarantee
MMIO timings.

> However the smp_rmb() is a dmb(ish) which means it is
> only a barrier as far as the inner shareable domain which I believe only
> goes as far as the local shared cache hierarchy and only guarantees read
> ordering without necessarily halting the CPU or stopping in-order
> speculative reads.  So what a coherent_rmb() would be in my setup is
> dmb(sy) which means the barrier runs all the way out to memory, and it
> is allowed to speculative read as long as it does it in order.

Correct, which is thus the same as smp_rmb() ... which was my original
point, or am I missing something else ?
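
For reference, the ordering being discussed can be sketched in portable
user-space C. This is only an analogy under stated assumptions, not kernel
code: C11 fences stand in for the write and read barriers (they are somewhat
stronger than wmb()/rmb()), and struct rx_desc, DESC_OWN, desc_complete()
and desc_poll() are hypothetical names made up for the illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "device still owns this descriptor" bit. */
#define DESC_OWN 0x80000000u

/* Hypothetical descriptor living in coherent memory. */
struct rx_desc {
    _Atomic uint32_t status;   /* ownership flag, written last by the producer */
    uint32_t len;              /* payload field, valid once DESC_OWN is clear */
};

/* Producer side (stands in for the device or another CPU). */
static void desc_complete(struct rx_desc *d, uint32_t len)
{
    d->len = len;
    /* Roughly the write-barrier role: payload must be visible before the flag. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->status, 0, memory_order_relaxed);
}

/* Consumer side: check ownership first, then read the rest. */
static bool desc_poll(struct rx_desc *d, uint32_t *len)
{
    if (atomic_load_explicit(&d->status, memory_order_relaxed) & DESC_OWN)
        return false;          /* producer has not handed it over yet */
    /* Roughly the read-barrier role: later loads may not pass the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *len = d->len;
    return true;
}
```

Nothing in this pattern needs the CPU to stall until the load completes; it
only needs the two loads to be observed in order, which is the case argued
above for an ordering-only barrier rather than a dsb().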

> If it is still unclear you might check out Will Deacon's talk on the
> topic at https://www.youtube.com/watch?v=6ORn6_35kKo, at about 7:00 in
> he explains the whole domains thing, and at 13:30 he explains dmb()/dsb().

Ok, I'll try to watch that when I get a chance.

Cheers,
Ben.
