Thread: 26 messages, 7 authors, 2011-08-30

[PATCH] usb: ehci: fix update qtd->token in qh_append_tds

From: stern@rowland.harvard.edu (Alan Stern)
Date: 2011-08-29 13:57:51
Also in: linux-omap

On Mon, 29 Aug 2011, Russell King - ARM Linux wrote:
> > You know better than I do what is needed to resolve the ordering issue.
> > However, contrary to what the original patch description said, this
> > isn't entirely a matter of making the write visible to the host
> > controller: No doubt in time the write will eventually become visible
> > anyway.  It's a matter of making the write become visible reasonably
> > quickly and in the correct order with respect to other writes.
>
> I'm not entirely sure what the problem is - I think it's about a write
> by the CPU to dma coherent memory being delayed and not being visible
> to the HC in a timely manner.  Either mb() or wmb() placed after the
> write on ARM will do that - and ARM has no requirement to do a read-
> back after the barrier.

Okay, then this needs to be done in a way that won't slow down other
architectures with an unnecessary memory barrier.  And there needs to
be a comment in the code explaining that the new mb() instruction isn't
being used as a memory barrier but rather to expedite writeback of the
L2 cache.

This certainly is starting to sound like something that needs to be 
addressed in the arch-specific #include files...
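
As a rough illustration of that arch-specific idea (a userspace sketch, not
kernel code and not the patch under discussion): the macro name
desc_write_expedite(), the struct fake_qtd type and the fake_update_token()
helper are all hypothetical, and a C11 fence merely stands in for whatever
ARM would really need to push the write out of the L2 cache.  Other
architectures get an empty definition, so they pay nothing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-architecture hook (NOT an existing kernel macro).
 * In a real tree each arch header would supply its own definition;
 * here a C11 fence is only a stand-in for the ARM-specific operation.
 */
#if defined(__arm__) || defined(__aarch64__)
#define desc_write_expedite()	atomic_thread_fence(memory_order_seq_cst)
#else
#define desc_write_expedite()	do { } while (0)	/* no cost elsewhere */
#endif

/* Stand-in for a descriptor living in DMA-coherent memory. */
struct fake_qtd {
	volatile uint32_t hw_token;
};

/* Update the token, then ask the arch to push the write out promptly. */
static uint32_t fake_update_token(struct fake_qtd *qtd, uint32_t token)
{
	assert(qtd != NULL);
	qtd->hw_token = token;
	/*
	 * Not here for CPU-to-CPU ordering: on ARM the intent is to
	 * expedite writeback so the controller sees the token soon.
	 */
	desc_write_expedite();
	return qtd->hw_token;
}
```

The comment next to the hook is the important part: it records that the
call exists to expedite the write, not to order accesses between CPUs.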

> > Is this extra L2-cache "poke" needed for proper ordering, or is it
> > needed merely to flush the write out to memory in a timely manner?
>
> Both, though primarily it's about ensuring correct ordering.  A side
> effect of it is that it will flush all pending writes in L2 before
> completing.
>
> From the theoretical viewpoint, I think I'm right to say that mb()
> doesn't need to provide that level of ordering as it's supposed to be
> an inter-CPU barrier - which probably means we need to invent a new
> barrier to deal with DMA memory ordering.  However, given the
> difficulty of getting the existing barriers placed correctly, I don't
> think inventing new barriers is a very good idea.
>
> What we can do is view devices which perform DMA as being strongly
> ordered with respect to their memory accesses - iow, they have an
> implicit memory barrier before and after their accesses to memory.
> This would make the CPU's use of mb() have a conceptual pairing with
> the DMA agents.

Yes, that's the model I have been using all along.  After all, if a DMA 
master carries out its memory accesses in some random order then it's 
impossible for the CPU to make any guarantees.
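
The pairing can be sketched in userspace C, under loud assumptions: C11
fences stand in for the kernel's wmb() and for the device's implicit
barrier, and struct fake_desc, FAKE_TOKEN_ACTIVE, cpu_publish() and
device_fetch() are hypothetical names for illustration, not anything
from the EHCI driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_TOKEN_ACTIVE 0x80u	/* illustrative ownership bit */

struct fake_desc {
	uint32_t buf_addr;		/* payload the device will read */
	_Atomic uint32_t token;		/* ownership word the device polls */
};

/* CPU side: fill in the payload, fence, then hand the descriptor over. */
static void cpu_publish(struct fake_desc *d, uint32_t addr)
{
	assert(d != NULL);
	d->buf_addr = addr;
	/* Plays the role of wmb(): payload must be visible before the token. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->token, FAKE_TOKEN_ACTIVE,
			      memory_order_relaxed);
}

/* Device side, modelled as strongly ordered: token first, then payload. */
static int device_fetch(struct fake_desc *d, uint32_t *addr_out)
{
	uint32_t t = atomic_load_explicit(&d->token, memory_order_relaxed);

	if (!(t & FAKE_TOKEN_ACTIVE))
		return 0;		/* descriptor not handed over yet */
	/* Models the implicit barrier assumed on the DMA master's side. */
	atomic_thread_fence(memory_order_acquire);
	*addr_out = d->buf_addr;
	return 1;
}
```

If the device side read the payload before checking the token, no fence
on the CPU side could help, which is the point made above.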

Alan Stern