Re: MMIO and gcc re-ordering issue
From: Trent Piepho <hidden>
Date: 2008-06-04 06:40:21
Also in: linux-arch, lkml
On Wed, 4 Jun 2008, Nick Piggin wrote:
> On Wednesday 04 June 2008 07:44, Trent Piepho wrote:
> > On Tue, 3 Jun 2008, Matthew Wilcox wrote:
> > > I don't understand why you keep talking about DMA.  Are you talking
> > > about ordering between readX() and DMA?  PCI provides those guarantees.
> >
> > I guess you haven't been reading the whole thread.  The reason it started
> > was because gcc can re-order powerpc (and everyone else's too) IO accesses
> > vs accesses to cacheable memory (but not spin-locks), which ends up only
> > being a problem with coherent DMA.
>
> I don't think it is only a problem with coherent DMA.
>
> CPU0                            CPU1
> mutex_lock(mutex);
> writel(something, DATA_REG);
> writel(GO, CTRL_REG);
> started = 1;
(A)
> mutex_unlock(mutex);
>                                 mutex_lock(mutex);
(B)
>                                 if (started)
>                                   /* oops, this can reach device before GO */
>                                   writel(STOP, CTRL_REG);

The locks themselves should have (and do have) ordering operations to ensure
that gcc and/or the cpu can't move a store or load outside the locked region.
Generally you need that to keep stores/loads to cacheable memory inside the
critical area, let alone I/O operations.  Otherwise all you have to do is
replace writel(something, ...) with shared_data->something = ... and there's
an obvious problem.

In your example, gcc currently can and will move the GO operation to point A
(if it can figure out that CTRL_REG and started aren't aliased), but that's
not a problem.  If it could move it to B that would be a problem, but it
can't.

Other than coherent DMA, I don't think there is any reason to care if I/O
accessors are strongly ordered wrt load/stores to cacheable memory.  Locking
and streaming DMA sync operations already need to have ordering, so they don't
require all I/O to be ordered wrt all cacheable memory.
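
The lock argument above can be sketched in userspace C. This is only an
analogy under stated assumptions: fake_writel(), dev_log and the CMD_* values
are hypothetical stand-ins, the "registers" are ordinary memory, and a pthread
mutex replaces the kernel mutex, so it says nothing about real MMIO posting or
bus behaviour. It only illustrates that whatever is stored inside CPU0's
critical section (in either order, as at point A) is complete before CPU1 can
enter its own, so STOP cannot be logged ahead of GO.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical command codes; not taken from any real driver. */
enum { CMD_DATA = 1, CMD_GO = 2, CMD_STOP = 3 };

#define LOG_MAX 8
static int dev_log[LOG_MAX];   /* order in which "register" writes arrived */
static size_t dev_log_len;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int started;

/* Stand-in for writel(): records the command. Only called with lock held. */
static void fake_writel(int cmd)
{
    assert(dev_log_len < LOG_MAX);
    dev_log[dev_log_len++] = cmd;
}

static void *cpu0(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    fake_writel(CMD_DATA);
    fake_writel(CMD_GO);
    started = 1;                 /* swapping this with GO (point A) is harmless */
    pthread_mutex_unlock(&lock); /* release: earlier stores cannot sink below */
    return NULL;
}

static void *cpu1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);   /* acquire: later accesses cannot hoist above */
    if (started)
        fake_writel(CMD_STOP);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if STOP (when present) follows GO, 0 if not, -1 on thread error. */
int run_scenario(void)
{
    pthread_t t0, t1;
    size_t i, go_at = LOG_MAX, stop_at = LOG_MAX;

    dev_log_len = 0;
    started = 0;
    if (pthread_create(&t0, NULL, cpu0, NULL) != 0)
        return -1;
    if (pthread_create(&t1, NULL, cpu1, NULL) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    for (i = 0; i < dev_log_len; i++) {
        if (dev_log[i] == CMD_GO)
            go_at = i;
        if (dev_log[i] == CMD_STOP)
            stop_at = i;
    }
    if (go_at == LOG_MAX)
        return 0;                /* cpu0 always writes GO, so this is a failure */
    return stop_at == LOG_MAX || go_at < stop_at;
}
```

Calling run_scenario() repeatedly from main() (built with -pthread) should
return 1 every time: if CPU1 wins the lock first it sees started == 0 and
writes nothing, and otherwise it sees both GO and started already in place.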