Re: [PATCH] nvmet-rdma: Don't use the inline buffer in order to avoid... | linux-rdma

Re: [PATCH] nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads

From: Sagi Grimberg <hidden>
Date: 2016-08-02 13:38:58
Also in: linux-nvme

quoted

Under extreme conditions this might cause data corruption. By doing that
we repost the buffer and then post this same buffer for the device to send.
If we happen to use shared receive queues the device might write to the
buffer before it sends it (there is no ordering between send and recv
queues). Without SRQs we probably won't get that if the host doesn't
mis-behave and send more than we allowed it, but relying on that is not
really a good idea.

Pity - it seems so wasteful not being able to use these buffers for
anything that isn't an inline write.

Totally agree, I'm open to smart ideas on this...

I fully agree on the SRQ case, but I think we should offer it for the non-SRQ case.

As I wrote, even in the non-SRQ case, if the host sends even a single
write beyond the negotiated queue size, the data can land in the buffer
that is currently being sent (it's a rare race condition, but
theoretically possible). The reason is that we repost the inline data
buffer for receive before we post the send request. We used to have
it the other way around (which eliminates the issue) but we then saw
some latency bubbles due to the HW sending RNR NAKs to the host for
lack of a receive buffer (in iWARP the problem was even worse
because there is no flow control).
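
To make the ordering concrete, here is a toy model of the race in plain
userspace C. It is only a sketch, not the nvmet-rdma code: struct
inline_buf, device_recv(), device_send() and simulate() are hypothetical
names invented for illustration, and the "device" is just a pair of
functions that copy bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define INLINE_SIZE 16

/* Hypothetical model of one inline data buffer used for both recv and send. */
struct inline_buf {
	char data[INLINE_SIZE];
	bool posted_for_recv;	/* the device may place incoming data here */
};

/*
 * Models the device placing an inline write from the host.
 * Returns false if no receive buffer is posted (on IB that would
 * show up as an RNR NAK to the host).
 */
static bool device_recv(struct inline_buf *buf, const char *payload)
{
	if (!buf->posted_for_recv)
		return false;
	strncpy(buf->data, payload, INLINE_SIZE - 1);
	buf->data[INLINE_SIZE - 1] = '\0';
	buf->posted_for_recv = false;	/* the receive has been consumed */
	return true;
}

/* Models the device transmitting whatever is in the buffer right now. */
static void device_send(const struct inline_buf *buf, char *wire)
{
	assert(buf && wire);
	memcpy(wire, buf->data, INLINE_SIZE);
}

/*
 * Returns true if the bytes that reach the wire are the read data we
 * meant to send. host_overruns models a host that sends one write more
 * than the negotiated queue size, arriving just before the send.
 */
static bool simulate(bool repost_before_send, bool host_overruns)
{
	struct inline_buf buf = { .posted_for_recv = false };
	const char *read_data = "read-payload";
	char wire[INLINE_SIZE];

	/* Small read: the response data is staged in the inline buffer. */
	strncpy(buf.data, read_data, INLINE_SIZE - 1);

	if (repost_before_send)
		buf.posted_for_recv = true;	/* recv reposted first */

	if (host_overruns)
		device_recv(&buf, "stray-write");

	device_send(&buf, wire);

	if (!repost_before_send)
		buf.posted_for_recv = true;	/* recv reposted after the send */

	return strcmp(wire, read_data) == 0;
}
```

In this model simulate(true, true) is the only combination where the
wire data is wrong. With the send posted first, the stray write finds
no receive buffer, which is the RNR NAK latency cost described above.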

Do you think it's OK to risk data corruption if the host is
misbehaving?
