Thread: 8 messages, 4 authors, 2021-01-19

Re: [PATCH 1/2] xhci: make sure TRB is fully written before giving it to the controller

From: Mathias Nyman <hidden>
Date: 2021-01-18 12:08:06
Also in: stable

On 15.1.2021 19.21, Sergei Shtylyov wrote:
> On 1/15/21 7:50 PM, David Laight wrote:
>> From: Sergei Shtylyov
>>> Sent: 15 January 2021 16:40
>>>
>>> On 1/15/21 7:19 PM, Mathias Nyman wrote:
>>>> Once the command ring doorbell is rung the xHC controller will parse all
>>>> command TRBs on the command ring that have the cycle bit set properly.
>>>>
>>>> If the driver just started writing the next command TRB to the ring when
>>>> hardware finished the previous TRB, then HW might fetch an incomplete TRB
>>>> as long as its cycle bit set correctly.
>>>>
>>>> A command TRB is 16 bytes (128 bits) long.
>>>> Driver writes the command TRB in four 32 bit chunks, with the chunk
>>>> containing the cycle bit last. This does however not guarantee that
>>>> chunks actually get written in that order.
>>>>
>>>> This was detected in stress testing when canceling URBs with several
>>>> connected USB devices.
>>>> Two consecutive "Set TR Dequeue pointer" commands got queued right
>>>> after each other, and the second one was only partially written when
>>>> the controller parsed it, causing the dequeue pointer to be set
>>>> to bogus values. This was seen as error messages:
>>>>
>>>> "Mismatch between completed Set TR Deq Ptr command & xHCI internal state"
>>>>
>>>> Solution is to add a write memory barrier before writing the cycle bit.
>>>>
>>>> Cc: <redacted>
>>>> Tested-by: Ross Zwisler <redacted>
>>>> Signed-off-by: Mathias Nyman <redacted>
>>>> ---
>>>>  drivers/usb/host/xhci-ring.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>>>> index 5677b81c0915..cf0c93a90200 100644
>>>> --- a/drivers/usb/host/xhci-ring.c
>>>> +++ b/drivers/usb/host/xhci-ring.c
>>>> @@ -2931,6 +2931,8 @@ static void queue_trb(struct xhci_hcd *xhci, struct xhci_ring *ring,
>>>>  	trb->field[0] = cpu_to_le32(field1);
>>>>  	trb->field[1] = cpu_to_le32(field2);
>>>>  	trb->field[2] = cpu_to_le32(field3);
>>>> +	/* make sure TRB is fully written before giving it to the controller */
>>>> +	wmb();
>>>
>>>    Have you tried the lighter barrier, dma_wmb()? IIRC, it exists for these exact cases...

True, good point, dma_wmb() should be enough here.
In fact most other wmb()s in xhci could be turned into dma_wmb().

Looks like Greg already picked this so maybe a later patch to usb-next that does this
wmb() -> dma_wmb() optimization where possible.

>> Isn't dma_wmb() needed between the last memory write and the io_write to the doorbell?
>
>    No.

Transfer TRBs already have a wmb() in giveback_first_trb(),
so there is no need in that case.

For command TRBs it's unlikely but not impossible.
The issue we are solving here is the xHC controller parsing two commands after a doorbell ring.
The first one was the intended, properly written command. The second was an out-of-order,
partially written command. The driver hadn't even rung the doorbell for the second command yet.

There are a couple of operations between the last TRB memory write and the command doorbell ring.
A wmb() in that place would solve a case where the memory write is so out of order and delayed
that the xHC controller reads and reacts to the doorbell ring, and reads the command ring,
before the memory write to the command ring is done. Unlikely but not impossible.

No such issues seen so far, but maybe a dma_wmb() in xhci_ring_cmd_db() wouldn't hurt.

-Mathias
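
For readers following the thread, the ordering problem discussed above can be
modelled in plain userspace C11. This is only a rough sketch, not the xhci
driver code: struct trb_model, publish_trb() and consume_trb() are
hypothetical names, the consumer function stands in for the controller, and a
C11 release store stands in for the kernel's wmb()/dma_wmb() placed before the
write of the word that holds the cycle bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TRB_CYCLE 0x1u	/* cycle bit lives in the last 32-bit word */

/* Hypothetical userspace model of a 16-byte TRB (four 32-bit words). */
struct trb_model {
	uint32_t field[3];		/* payload words */
	_Atomic uint32_t control;	/* last word, carries the cycle bit */
};

static_assert(sizeof(struct trb_model) == 16, "a TRB is 16 bytes");

/*
 * Producer side (the driver in the thread): write the three payload
 * words, then publish the word with the cycle bit. The release store
 * keeps the payload writes from becoming visible after the cycle bit,
 * which is the role the barrier plays in the patch.
 */
void publish_trb(struct trb_model *trb, uint32_t f0, uint32_t f1,
		 uint32_t f2, uint32_t control_with_cycle)
{
	trb->field[0] = f0;
	trb->field[1] = f1;
	trb->field[2] = f2;
	atomic_store_explicit(&trb->control, control_with_cycle,
			      memory_order_release);
}

/*
 * Consumer side (stand-in for the controller): only touch the payload
 * once the cycle bit matches the expected value.
 * Returns 1 and fills out[] if the TRB is owned by the consumer, else 0.
 */
int consume_trb(struct trb_model *trb, uint32_t expected_cycle,
		uint32_t out[3])
{
	uint32_t control = atomic_load_explicit(&trb->control,
						memory_order_acquire);

	if ((control & TRB_CYCLE) != expected_cycle)
		return 0;

	out[0] = trb->field[0];
	out[1] = trb->field[1];
	out[2] = trb->field[2];
	return 1;
}
```

Without the release ordering on the last store, nothing would stop the
consumer from seeing a matching cycle bit together with stale payload words,
which is the partially written command described in the commit message.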