[PATCH] usb: ehci: fix update qtd->token in qh_append_tds
From: Greg KH <hidden>
Date: 2011-08-27 16:07:46
Also in:
linux-omap
On Sat, Aug 27, 2011 at 11:33:26PM +0800, Ming Lei wrote:
Hi, On Sat, Aug 27, 2011 at 11:13 PM, Greg KH [off-list ref] wrote:quoted
On Sat, Aug 27, 2011 at 10:48:35PM +0800, ming.lei at canonical.com wrote:quoted
From: Ming Lei <redacted> This patch fixs one performance bug on ARM Cortex A9 dual core platform, which has been reported on quite a few ARM machines(OMAP4, Tegra 2, snowball...), see details from link of https://bugs.launchpad.net/bugs/709245. In fact, one mb() on ARM is enough to flush L2 cache, but 'dummy->hw_token = token;' after mb() is added just for obeying correct mb() usage.Really? ?A mb() should not be flushing any caches, it's just a memory barrier. ?Or is ARM somehow "special" in this way?As Santosh pointed out, mb on ARM will flush L2 write buffer. The description here is wrong.
Then this can't be accepted as-is :)
I think the below should make the writing reach into memory on all
ARCH after ' token = dummy->hw_token;' is executed.
dummy->hw_token = token;
mb()
token = dummy->hw_token;
The above is the idea introduced to fix the problem.Are you sure? Have you read the documentation about memory barriers to confirm this?
quoted
quoted
The patch has been tested ok on OMAP4 panda A1 board, the performance of 'dd' over usb mass storage can be increased from 4~5MB/sec to 14~16MB/sec after applying this patch.That's impressive, but I don't think this is really the proper way to do this...quoted
Signed-off-by: Ming Lei <redacted> --- ?drivers/usb/host/ehci-q.c | ? 14 ++++++++++++++ ?1 files changed, 14 insertions(+), 0 deletions(-)diff --git a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c index 0917e3a..65b5021 100644 --- a/drivers/usb/host/ehci-q.c +++ b/drivers/usb/host/ehci-q.c@@ -1082,6 +1082,20 @@ static struct ehci_qh *qh_append_tds (? ? ? ? ? ? ? ? ? ? ? wmb (); ? ? ? ? ? ? ? ? ? ? ? dummy->hw_token = token; + ? ? ? ? ? ? ? ? ? ? /* The mb() below is added to make sure that + ? ? ? ? ? ? ? ? ? ? ?* 'token' can be writen into qtd, so that ehci + ? ? ? ? ? ? ? ? ? ? ?* HC can see the up-to-date qtd descriptor. On + ? ? ? ? ? ? ? ? ? ? ?* some archs(at least on ARM Cortex A9 dual core), + ? ? ? ? ? ? ? ? ? ? ?* writing into coherenet memory doesn't mean the + ? ? ? ? ? ? ? ? ? ? ?* value written can reach physical memory + ? ? ? ? ? ? ? ? ? ? ?* immediately, and the value may be buffered + ? ? ? ? ? ? ? ? ? ? ?* inside L2 cache. 'dummy->hw_token = token;' + ? ? ? ? ? ? ? ? ? ? ?* after mb() is added for obeying correct mb() + ? ? ? ? ? ? ? ? ? ? ?* usage. + ? ? ? ? ? ? ? ? ? ? ?* */ + ? ? ? ? ? ? ? ? ? ? mb(); + ? ? ? ? ? ? ? ? ? ? token = dummy->hw_token;Your comment does not match the code, so something is wrong here.If you mean "L2 cache flush", I confess to the mistaken description, and will update it later. If you mean others, could you help to point it out?
I mean others, please read the the last 3 lines of the comment and compare that to the code lines you added. greg k-h