Thread: 101 messages, 19 authors, 2008-06-12

Re: MMIO and gcc re-ordering issue

From: Trent Piepho <hidden>
Date: 2008-06-04 06:40:21
Also in: linux-arch, lkml

On Wed, 4 Jun 2008, Nick Piggin wrote:
On Wednesday 04 June 2008 07:44, Trent Piepho wrote:
On Tue, 3 Jun 2008, Matthew Wilcox wrote:
I don't understand why you keep talking about DMA.  Are you talking
about ordering between readX() and DMA?  PCI provides those guarantees.
I guess you haven't been reading the whole thread.  The reason it started
was because gcc can re-order powerpc (and everyone else's too) IO accesses
vs accesses to cachable memory (but not spin-locks), which ends up only
being a problem with coherent DMA.
I don't think it is only a problem with coherent DMA.

CPU0                            CPU1
mutex_lock(mutex);
writel(something, DATA_REG);
writel(GO, CTRL_REG);
started = 1;
        (A)
mutex_unlock(mutex);
                                mutex_lock(mutex);
        (B)
                                if (started)
                                  /* oops, this can reach device before GO */
                                  writel(STOP, CTRL_REG);
The locks themselves should have (and do have) ordering operations to ensure
gcc and/or the CPU can't move a store or load outside the locked region.
Generally you need that just to keep stores/loads to cacheable memory inside
the critical area, let alone I/O operations.  Otherwise all you have to do is
replace writel(something, ...) with shared_data->something = ...  and there's
an obvious problem.  In your example, gcc currently can and will move the GO
operation to point A (if it can figure out that CTRL_REG and started aren't
aliased), but that's not a problem.  If it could move it to B that would be a
problem, but it can't.
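
To make the (A) versus (B) distinction concrete, here is a minimal user-space
sketch, not kernel code.  Everything in it is an assumption for illustration:
demo_writel(), fake_regs, ctrl_log and the toy lock are hypothetical stand-ins
for writel(), the device registers and the mutex.  It models only the
compiler-level constraint discussed above: the acquire/release operations keep
the accesses inside the critical section, while `started = 1` and the GO write
remain free to swap with each other.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the device registers in the example. */
enum { DATA_REG, CTRL_REG, NUM_REGS };
enum { GO = 1, STOP = 2 };

static volatile uint32_t fake_regs[NUM_REGS]; /* models MMIO space */
static uint32_t ctrl_log[4];                  /* order of CTRL_REG writes */
static int ctrl_log_len;
static int started;                           /* ordinary cacheable data */

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Toy writel(): a volatile store, logged so the order can be inspected. */
static void demo_writel(uint32_t val, int reg)
{
    fake_regs[reg] = val;
    if (reg == CTRL_REG && ctrl_log_len < 4)
        ctrl_log[ctrl_log_len++] = val;
}

/* Acquire: accesses after this point cannot be hoisted above it. */
static void demo_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

/* Release: accesses before this point cannot sink below it. */
static void demo_lock_release(void)
{
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

void cpu0_start(uint32_t something)
{
    demo_lock_acquire();
    demo_writel(something, DATA_REG);
    demo_writel(GO, CTRL_REG);
    started = 1;          /* may trade places with the GO write: point (A) */
    demo_lock_release();  /* nothing above may move past here: point (B) */
}

void cpu1_stop(void)
{
    demo_lock_acquire();
    if (started)
        demo_writel(STOP, CTRL_REG);
    demo_lock_release();
}
```

Calling cpu1_stop() before cpu0_start() writes nothing, because started is
still 0; calling it afterwards logs STOP after GO.  Whether a real lock also
orders MMIO at the hardware level is architecture-specific and is not modelled
here.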

Other than coherent DMA, I don't think there is any reason to care if I/O
accessors are strongly ordered wrt loads/stores to cacheable memory.  Locking
and streaming DMA sync operations already need to have ordering, so they don't
require all I/O to be ordered wrt all cacheable memory.
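
For the coherent DMA case, the pattern at issue can be sketched as follows.
This is a user-space illustration under stated assumptions, not driver code:
demo_ring, demo_doorbell_reg, demo_writel() and post_descriptor() are
hypothetical names, and compiler_barrier() uses the GCC/Clang inline-asm
extension, which constrains only the compiler.  A real driver would use a
barrier such as wmb(), which also orders the CPU.

```c
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang extension): stops the compiler from
 * moving memory accesses across it; emits no CPU instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical descriptor that the device would fetch by DMA. */
struct demo_desc {
    uint32_t buf_addr;
    uint32_t len;
};

static struct demo_desc demo_ring[1];        /* models coherent DMA memory */
static volatile uint32_t demo_doorbell_reg;  /* models an MMIO register */

/* Toy writel(): a volatile store.  Volatile orders it against other
 * volatile accesses only, not against the plain stores below. */
static void demo_writel(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}

void post_descriptor(uint32_t buf_addr, uint32_t len)
{
    /* Plain stores to cacheable memory that the device will read. */
    demo_ring[0].buf_addr = buf_addr;
    demo_ring[0].len = len;

    /* Without this, the compiler may sink the stores above past the
     * doorbell write, so the device could fetch a stale descriptor. */
    compiler_barrier();

    /* Tell the device the descriptor is ready. */
    demo_writel(1, &demo_doorbell_reg);
}
```

With streaming DMA the sync call sits where the barrier is and already
provides the ordering, which is why only the coherent case is exposed.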