
Re: [PATCH] RDS: sync congestion map updating

From: santosh shilimkar <hidden>
Date: 2016-03-30 17:16:06

Hi Wengang,

On 3/30/2016 9:19 AM, Leon Romanovsky wrote:
On Wed, Mar 30, 2016 at 05:08:22PM +0800, Wengang Wang wrote:
Problem is found that some among a lot of parallel RDS communications hang.
In my test ten or so among 33 communications hang. The send requests got
-ENOBUFS error meaning the peer socket (port) is congested. But meanwhile,
peer socket (port) is not congested.

The congestion map updating can happen in two paths: one is in rds_recvmsg path
and the other is when it receives packets from the hardware. There is no
synchronization when updating the congestion map. So a bit operation (clearing)
in the rds_recvmsg path can be skipped by another bit operation (setting) in
hardware packet receiving path.

Fix is to add a spin lock per congestion map to sync the update on it.
No performance drop found during the test for the fix.
I assume that this change fixed your issue, however it looks suspicious
that performance didn't change.
First of all, thanks for finding the issue and posting a patch
for it. I do agree with Leon on the performance comment.
We shouldn't need locks for map updates.

Moreover, the parallel receive path on which this patch
is based doesn't exist in the upstream code. I have kept
that out so far because of issues similar to the one you
encountered.

Anyway, let's discuss the fix offline, even for the
downstream kernel. I suspect we can address it without locks.
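
For illustration only, here is a minimal userspace sketch of what a
lock-free update could look like, using C11 atomics instead of kernel
bitops. The names cong_map, cong_set_port, cong_clear_port and
cong_test_port are hypothetical and this is not the RDS code. It only
shows that an atomic read-modify-write on the bitmap word keeps a set
and a clear on neighbouring bits from overwriting each other. Whether
that alone covers the downstream race is still to be confirmed.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sizes: one bit per 16-bit port number. */
#define CONG_PORTS    65536u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define CONG_WORDS    (CONG_PORTS / BITS_PER_WORD)

/* Hypothetical congestion bitmap; a set bit means "port is congested". */
struct cong_map {
    _Atomic unsigned long words[CONG_WORDS];
};

/* Mark a port congested. The OR is a single atomic read-modify-write,
 * so a concurrent update to another bit in the same word is not lost. */
static void cong_set_port(struct cong_map *map, unsigned int port)
{
    assert(port < CONG_PORTS);
    atomic_fetch_or(&map->words[port / BITS_PER_WORD],
                    1UL << (port % BITS_PER_WORD));
}

/* Mark a port uncongested, again as one atomic read-modify-write. */
static void cong_clear_port(struct cong_map *map, unsigned int port)
{
    assert(port < CONG_PORTS);
    atomic_fetch_and(&map->words[port / BITS_PER_WORD],
                     ~(1UL << (port % BITS_PER_WORD)));
}

/* Report whether a port is currently marked congested. */
static bool cong_test_port(struct cong_map *map, unsigned int port)
{
    assert(port < CONG_PORTS);
    return (atomic_load(&map->words[port / BITS_PER_WORD]) >>
            (port % BITS_PER_WORD)) & 1UL;
}
```

With a plain load, modify, store sequence in place of the atomic calls,
a clear from one path could write back a stale word and drop a bit that
the other path had just set, or the reverse.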

Regards,
Santosh