Thread: 19 messages, 4 authors, 2010-03-01

[PATCH 1/4] ARM: Change the mandatory barriers implementation

From: catalin.marinas@arm.com (Catalin Marinas)
Date: 2010-02-26 15:43:50

On Tue, 2010-02-23 at 18:03 +0000, Russell King - ARM Linux wrote:
On Tue, Feb 23, 2010 at 04:02:35PM +0000, Catalin Marinas wrote:
[...]
I'm not entirely convinced by the part of your patch which changes the
SMP barriers yet.  For instance, some drivers contain:

                /* We need to force the visibility of tp->intr_mask
                 * for other CPUs, as we can lose an MSI interrupt
                 * and potentially wait for a retransmit timeout if we don't.
                 * The posted write to IntrMask is safe, as it will
                 * eventually make it to the chip and we won't lose anything
                 * until it does.
                 */
                tp->intr_mask = 0xffff;
                smp_wmb();
                RTL_W16(IntrMask, tp->intr_event);

The second write is a write to hardware, and thus would be to a device
region.  The first is a write to a memory structure.

It seems to me, given your description in the patch, that having smp_wmb()
be a dmb(), rather than a wmb(), would be insufficient here.
[...]
Given what you've said, it would appear that smp_wmb() needs to be a
wmb() in the SMP case, to ensure that the write to intr_mask is
visible to other CPUs before the interrupt mask write hits the
peripheral.

So, that leads us back to the:

#ifndef CONFIG_SMP
#define smp_mb()        barrier()
#define smp_rmb()       barrier()
#define smp_wmb()       barrier()
#else
#define smp_mb()        mb()
#define smp_rmb()       rmb()
#define smp_wmb()       wmb()
#endif

A better implementation would be this:

#ifndef CONFIG_SMP
#define smp_mb()	barrier()
#define smp_rmb()	barrier()
#define smp_wmb()	barrier()
#else
#define smp_mb()	dsb()
#define smp_rmb()	dmb()
#define smp_wmb()	dsb()
#endif

This avoids mb(), which may have other effects, such as draining the L2
write buffer, that are definitely not needed for the SMP barriers.

Anyway, the above change to smp_*mb() would probably have a performance
impact, especially with spinlocks.
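
To make it easier to see which primitive the quoted driver sequence would
end up executing under this mapping, here is a minimal user-space mock. It
is only a sketch under stated assumptions: barrier(), dmb() and dsb() are
replaced by stubs that record a trace entry and issue no real barrier
instruction, and trace(), trace_log, struct mock_dev and mask_then_kick()
are hypothetical names invented for the illustration. Only smp_wmb() is
modelled, since it is the one the driver uses.

```c
/*
 * User-space mock: records which primitive smp_wmb() expands to.
 * The stubs below do NOT issue ARM barrier instructions.
 */
#include <assert.h>
#include <stddef.h>

enum trace_op { OP_MEM_WRITE, OP_IO_WRITE, OP_BARRIER, OP_DMB, OP_DSB };

#define TRACE_MAX 8
static enum trace_op trace_log[TRACE_MAX];
static size_t trace_len;

/* Append one event to the trace (hypothetical helper). */
static void trace(enum trace_op op)
{
	assert(trace_len < TRACE_MAX);
	trace_log[trace_len++] = op;
}

/* Stand-ins for the kernel primitives named in this thread. */
#define barrier()	trace(OP_BARRIER)
#define dmb()		trace(OP_DMB)
#define dsb()		trace(OP_DSB)

#define CONFIG_SMP 1	/* assumption: model the SMP build */

#ifndef CONFIG_SMP
#define smp_wmb()	barrier()
#else
#define smp_wmb()	dsb()	/* swap in dmb() to model the other variant */
#endif

struct mock_dev {
	unsigned short intr_mask;	/* stands for normal memory */
	unsigned short intr_reg;	/* stands for a device register */
};

/* Same shape as the quoted driver code: memory write, smp_wmb(), I/O write. */
static void mask_then_kick(struct mock_dev *tp, unsigned short event)
{
	assert(tp != NULL);
	tp->intr_mask = 0xffff;
	trace(OP_MEM_WRITE);
	smp_wmb();
	tp->intr_reg = event;
	trace(OP_IO_WRITE);
}
```

Calling mask_then_kick() once leaves three entries in trace_log: the
memory write, the barrier that smp_wmb() expanded to, and the register
write. This says nothing about what the hardware actually guarantees; it
only makes the macro selection visible.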

I can see that the driver situation you described appears in other
drivers as well. Whether this is a correct usage model I can't tell. It
may be worth raising this on linux-arch. PowerPC, for example, uses a
light barrier for the smp_wmb() case which doesn't ensure ordering
between accesses to normal vs I/O memory.

-- 
Catalin