[PATCH] usb: ehci: fix update qtd->token in qh_append_tds
From: stern@rowland.harvard.edu (Alan Stern)
Date: 2011-08-29 01:51:10
Also in:
linux-omap
On Mon, 29 Aug 2011, Russell King - ARM Linux wrote:
> On Sun, Aug 28, 2011 at 01:00:07PM -0400, Alan Stern wrote:
> > It won't do that. All it will do is guarantee that the CPU writes out
> > dummy->hw_token before it writes out or reads in any values executed
> > after the mb.
>
> You're right from the perspective of how things are defined today.
> However, that isn't how things work on ARM.
>
> With ARMv6 and ARMv7, we have weak memory ordering. This includes
> so-called "DMA coherent" memory. This means that the architecture does
> not guarantee the order of writes to DMA coherent memory (which is
> non-cacheable normal memory) without an intervening 'data
> synchronization barrier' (dsb). Even that may not be sufficient without
> also poking at the L2 cache controller.
>
> We get around some of that by ensuring that our MMIO read/write macros
> contain the necessary barriers to ensure that DMA memory is up to date
> before the DMA agent is programmed. However, this doesn't cater for
> agents which continue to run in the background. These agents will need
> some kind of barrier to ensure that the write becomes visible - there's
> no way to get around that.
>
> Maybe we need yet another new barrier macro...

Hmmm. Although the semantics of the various mb() macros were originally
defined only for inter-CPU synchronization, I believe they are also
supposed to work for guaranteeing the order of accesses to DMA-coherent
memory. If that's not the case with ARM, something is seriously wrong.
(Maybe I'm wrong about this, but if I am then there's currently _no_ way
for the kernel to order DMA-coherent accesses on ARM.)

You know better than I do what is needed to resolve the ordering issue.
However, contrary to what the original patch description said, this
isn't entirely a matter of making the write visible to the host
controller: No doubt in time the write will eventually become visible
anyway. It's a matter of making the write become visible reasonably
quickly and in the correct order with respect to other writes.

Is this extra L2-cache "poke" needed for proper ordering, or is it
needed merely to flush the write out to memory in a timely manner?

Alan Stern
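
For readers following the ordering argument, here is a minimal user-space sketch of the pattern under discussion: fill in the other fields of a descriptor, issue a write barrier, and only then store the token that hands the descriptor over. It is not the kernel code. `struct fake_qtd`, `fake_publish_qtd()`, `fake_controller_poll()` and the `FAKE_QTD_STS_*` constants are hypothetical names invented for illustration, and a C11 release fence stands in for `wmb()`. C11 fences only model ordering between CPU threads, so the sketch shows the intended ordering and says nothing about whether that barrier is sufficient for a DMA agent on ARM, which is the open question in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor; NOT the kernel's struct ehci_qtd. */
struct fake_qtd {
    uint32_t hw_next;           /* link to the next descriptor */
    uint32_t hw_buf;            /* buffer address */
    _Atomic uint32_t hw_token;  /* status word; written last */
};

/* Illustrative status bits (assumed values for this sketch only). */
#define FAKE_QTD_STS_ACTIVE (1u << 7)
#define FAKE_QTD_STS_HALT   (1u << 6)

/* Producer side: publish a filled-in descriptor by storing its token last. */
static void fake_publish_qtd(struct fake_qtd *dummy, uint32_t next,
                             uint32_t buf, uint32_t token)
{
    dummy->hw_next = next;
    dummy->hw_buf = buf;

    /* Stand-in for wmb(): the stores above must not be reordered
     * after the token store below. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&dummy->hw_token, token, memory_order_relaxed);
}

/* Consumer side, modelling the host controller as another thread.
 * Returns 1 and copies the fields out if the descriptor is active,
 * 0 if it has not been handed over yet. */
static int fake_controller_poll(struct fake_qtd *qtd, uint32_t *next,
                                uint32_t *buf)
{
    uint32_t token = atomic_load_explicit(&qtd->hw_token,
                                          memory_order_relaxed);

    if (!(token & FAKE_QTD_STS_ACTIVE))
        return 0;

    /* Pairs with the release fence in fake_publish_qtd(). */
    atomic_thread_fence(memory_order_acquire);

    *next = qtd->hw_next;
    *buf = qtd->hw_buf;
    return 1;
}
```

In this model, a consumer that observes the active bit is guaranteed to see the `hw_next` and `hw_buf` values stored before the fence. Real hardware offers no matching acquire fence on the device side, which is why the producer-side barrier has to order the writes as they reach memory, not just as other CPUs see them.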