Thread (50 messages, 2 authors, 2021-05-17)

Re: nvme tcp receive errors

From: Keith Busch <kbusch@kernel.org>
Date: 2021-05-03 20:26:11

On Mon, May 03, 2021 at 12:58:23PM -0700, Sagi Grimberg wrote:
The driver tracepoints captured millions of IOs where everything
happened as expected, so I really think something got confused and
mucked with the wrong request. I've added more trace points to increase
visibility because I frankly didn't find how that could happen just from
code inspection. We will also incorporate your patch below for the next
recreate.
Keith, does the issue still happen after eliminating the network send
from .queue_rq()?
This patch is successful at resolving the observed r2t issues after the
weekend test run, which is much longer than it could have run
previously. I'm happy we're narrowing this down, but I'm not seeing how
this addresses the problem. It looks like the mutex single threads the
critical parts, but maybe I'm missing something. Any ideas?
Not yet, but note that the send part is mutually exclusive but the
receive context is where we handle the r2t, validate length/offset
and (re)queue the request for sending a h2cdata pdu back to the
controller.
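
As a rough illustration of what "validate length/offset" on an r2t can mean, here is a small user-space sketch. It is not the driver's code: the struct, the function name r2t_is_valid() and the exact set of checks are assumptions made for the example. It only models the idea that an r2t must ask for a non-empty range that fits in the data still owed and does not start before what was already sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified transfer state of one write request. */
struct req_state {
	uint32_t data_len;	/* total bytes the command writes */
	uint32_t data_sent;	/* bytes already sent to the controller */
};

/*
 * Assumed checks, for illustration only:
 *  - the r2t must request at least one byte,
 *  - it must not request more than the request still has to send,
 *  - it must not point before data that was already sent.
 */
static bool r2t_is_valid(const struct req_state *req,
			 uint32_t r2t_offset, uint32_t r2t_length)
{
	assert(req->data_sent <= req->data_len);

	if (r2t_length == 0)
		return false;
	if (r2t_length > req->data_len - req->data_sent)
		return false;
	if (r2t_offset < req->data_sent)
		return false;
	return true;
}
```

With an 8192-byte request that has already sent 4096 bytes, an r2t for offset 4096 and length 4096 passes, while an r2t for offset 0 fails the offset check. If request state were modified by another context at the wrong time, checks of this kind are where the inconsistency would show up.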

The network send was an optimization for latency, and then I modified
the queueing in the driver such that a request would first go to llist
and then the sending context (either io_work or .queue_rq) would reap it
to a local send_list. This helps the driver get a better understanding of
what is inflight such that it can better set network msg flags for EOR/MORE.
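
The queueing scheme described above can be modelled in user space roughly as follows. This is a sketch under assumptions, not the driver's implementation: it uses C11 atomics in place of the kernel llist helpers, all names (struct queue, queue_add, queue_reap, queue_fetch, send_hint) are made up for the example, and HINT_MORE/HINT_EOR are stand-ins for the MSG_MORE/MSG_EOR socket flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct req {
	int id;
	struct req *next;
};

struct queue {
	_Atomic(struct req *) pending;	/* lock-free LIFO, any submitter */
	struct req *send_head;		/* local FIFO, owned by the sender */
	struct req *send_tail;
};

enum send_hint { HINT_MORE, HINT_EOR };	/* stand-ins for MSG_MORE/MSG_EOR */

static void queue_init(struct queue *q)
{
	atomic_init(&q->pending, NULL);
	q->send_head = q->send_tail = NULL;
}

/* Submitter side: push onto the lock-free list. */
static void queue_add(struct queue *q, struct req *r)
{
	struct req *old = atomic_load(&q->pending);

	do {
		r->next = old;
	} while (!atomic_compare_exchange_weak(&q->pending, &old, r));
}

/* Sender side: take everything pending and append it in submission order. */
static void queue_reap(struct queue *q)
{
	struct req *r = atomic_exchange(&q->pending, NULL);
	struct req *fifo = NULL;

	while (r) {			/* reverse the LIFO chain */
		struct req *next = r->next;
		r->next = fifo;
		fifo = r;
		r = next;
	}
	if (!fifo)
		return;
	if (q->send_tail)
		q->send_tail->next = fifo;
	else
		q->send_head = fifo;
	while (fifo->next)
		fifo = fifo->next;
	q->send_tail = fifo;
}

/* Sender side: next request to send, reaping only when the local list is empty. */
static struct req *queue_fetch(struct queue *q)
{
	struct req *r;

	if (!q->send_head)
		queue_reap(q);
	r = q->send_head;
	if (!r)
		return NULL;
	q->send_head = r->next;
	if (!q->send_head)
		q->send_tail = NULL;
	r->next = NULL;
	return r;
}

/* More work visible means "more data follows", otherwise end of record. */
static enum send_hint send_hint(struct queue *q)
{
	bool more = q->send_head || atomic_load(&q->pending);

	return more ? HINT_MORE : HINT_EOR;
}
```

Only the sending context touches send_head/send_tail, so the model needs no lock there as long as senders are serialized against each other, which is the role the mutex plays in the discussion above. The hint is only a snapshot: a request added right after send_hint() returns HINT_EOR is simply picked up by the next fetch.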

My assumption is that maybe somehow we send the initial command
pdu to the controller from queue_rq, receive the r2t back before the
.queue_rq context has completed and something may not be coherent.
Interesting. The network traces look correct, so my thoughts jumped to
possibly incorrect usage of PCIe relaxed ordering, but that appears to
be disabled. I'll keep looking for other possibilities.
Side question, are you running with a fully preemptible kernel, or
with fewer NVMe queues than CPUs?
Voluntary preempt. This test is using the kernel config from Ubuntu
20.04.

There are 16 CPUs in this setup with just 7 IO queues.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme