Thread: 25 messages, 8 authors, 2014-11-18

Re: [PATCH 2/4] arch: Add lightweight memory barriers fast_rmb() and fast_wmb()

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2014-11-18 00:40:00
Also in: linux-arch, lkml

On Mon, 2014-11-17 at 12:18 -0800, Paul E. McKenney wrote:
> On Mon, Nov 17, 2014 at 09:18:13AM -0800, Alexander Duyck wrote:
> > There are a number of situations where the mandatory barriers rmb() and
> > wmb() are used to order memory/memory operations in the device drivers
> > and those barriers are much heavier than they actually need to be.  For
> > example in the case of PowerPC wmb() calls the heavy-weight sync
> > instruction when for memory/memory operations all that is really needed is
> > an lsync or eieio instruction.
>
> Is this still the case if one of the memory operations is MMIO?  Last
> I knew, it was not.

I *think* (Alexander, correct me if I'm wrong), that what he wants is
the memory<->memory barriers (the smp_* ones) basically for ordering his
loads or stores from/to the DMA area.

The problem is that the smp_* ones aren't compiled in for !CONFIG_SMP;
there they reduce to plain compiler barriers.

I.e. something like:

  - Read valid bit from descriptor

  - Read rest of descriptor

That needs an rmb of some sort in between, but a full blown "rmb" will
also order vs. MMIOs and end up being a full sync, while an smp_rmb is a
lwsync which is more lightweight.
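
As a rough userspace sketch of the read side above (not kernel code): the
struct layout and the demo_* names are made up for illustration, and the
C11 acquire fence stands in for a lightweight memory<->memory read
barrier such as smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; field names are illustrative only. */
struct demo_desc {
    _Atomic uint32_t valid; /* set last by whoever fills the descriptor */
    uint32_t addr;
    uint32_t len;
};

/* Returns true and copies out addr/len only if the descriptor is valid. */
bool demo_desc_read(struct demo_desc *d, uint32_t *addr, uint32_t *len)
{
    assert(d != NULL && addr != NULL && len != NULL);

    /* Step 1: read the valid bit. */
    if (!atomic_load_explicit(&d->valid, memory_order_relaxed))
        return false;

    /*
     * Read barrier between the two steps: the loads below must not be
     * satisfied before the valid-bit load above. This orders normal
     * memory only and says nothing about MMIO.
     */
    atomic_thread_fence(memory_order_acquire);

    /* Step 2: read the rest of the descriptor. */
    *addr = d->addr;
    *len = d->len;
    return true;
}
```

Without the fence, the CPU would be free to fetch addr and len before it
has seen the valid bit, and so could return stale contents.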

Similarly:

  - Populate descriptor

  - Write valid bit

Same deal with wmb ...
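
The write side can be sketched the same way, again in userspace C11 and
with made-up names (struct demo_desc, demo_desc_publish); the release
fence plays the role of a lightweight memory<->memory write barrier such
as smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; field names are illustrative only. */
struct demo_desc {
    _Atomic uint32_t valid; /* consumer polls this field */
    uint32_t addr;
    uint32_t len;
};

/* Fill in the descriptor, then mark it valid. */
void demo_desc_publish(struct demo_desc *d, uint32_t addr, uint32_t len)
{
    assert(d != NULL);

    /* Step 1: populate the descriptor. */
    d->addr = addr;
    d->len = len;

    /*
     * Write barrier between the two steps: the stores above must be
     * visible before the valid bit is. Only normal memory is ordered.
     */
    atomic_thread_fence(memory_order_release);

    /* Step 2: write the valid bit. */
    atomic_store_explicit(&d->valid, 1, memory_order_relaxed);
}
```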

Basically, rmb and wmb order both cacheable and non-cacheable accesses
(memory and MMIO), which makes them needlessly heavy on powerpc and
possibly others when all you need is to order memory accesses to some
DMA data structures. In that case you really want the normal smp_*
variants, except they may not be around...
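
To make the "may not be around" part concrete, here is a simplified
userspace mock, not the real kernel headers. All demo_* names and
DEMO_CONFIG_SMP are invented for the illustration, and a counter stands
in for an actual barrier instruction such as lwsync. The compiler-barrier
asm assumes GCC or Clang.

```c
#include <assert.h>

/* Counts the "hardware" barriers the mock would have emitted. */
int demo_hw_barriers;

/* Compiler-only barrier: stops compiler reordering, emits no instruction. */
#define demo_barrier()  __asm__ __volatile__("" ::: "memory")

/* Stand-in for a real barrier instruction. */
#define demo_hw_rmb()   do { demo_hw_barriers++; demo_barrier(); } while (0)

/* Mandatory barrier: always costs a hardware barrier. */
#define demo_rmb()      demo_hw_rmb()

/* SMP-conditional barrier: vanishes to a compiler barrier on UP builds. */
#ifdef DEMO_CONFIG_SMP
#define demo_smp_rmb()  demo_hw_rmb()
#else
#define demo_smp_rmb()  demo_barrier()
#endif

/* Number of hardware barriers issued by one demo_smp_rmb(). */
int demo_count_smp_rmb(void)
{
    demo_hw_barriers = 0;
    demo_smp_rmb();
    return demo_hw_barriers;
}

/* Number of hardware barriers issued by one demo_rmb(). */
int demo_count_rmb(void)
{
    demo_hw_barriers = 0;
    demo_rmb();
    assert(demo_hw_barriers == 1);
    return demo_hw_barriers;
}
```

Built without DEMO_CONFIG_SMP, demo_smp_rmb() issues no hardware barrier
at all. That is fine between CPUs on a uniprocessor, but a DMA-capable
device is still a second observer of memory, so the driver cannot rely
on it there.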

Cheers,
Ben.