Thread (12 messages), 5 authors, 2025-10-08

Re: [PATCH net-next v5 2/2] net/smc: handle -ENOMEM from smc_wr_alloc_link_mem gracefully

From: Paolo Abeni <pabeni@redhat.com>
Date: 2025-10-01 07:21:15
Also in: linux-doc, linux-rdma, linux-s390, lkml

On 9/29/25 11:22 AM, Halil Pasic wrote:
On Mon, 29 Sep 2025 09:50:52 +0800
Dust Li [off-list ref] wrote:
@@ -175,6 +175,8 @@ struct smc_link {
	struct completion	llc_testlink_resp; /* wait for rx of testlink */
	int			llc_testlink_time; /* testlink interval */
	atomic_t		conn_cnt; /* connections on this link */
+	u16			max_send_wr;
+	u16			max_recv_wr;  
Here, you've moved max_send_wr/max_recv_wr from the link group to individual links.
This means we can now have different max_send_wr/max_recv_wr values on two
different links within the same link group.
Only if allocations fail. Please notice that the hunk:
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -810,6 +810,8 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
 	lnk->clearing = 0;
 	lnk->path_mtu = lnk->smcibdev->pattr[lnk->ibport - 1].active_mtu;
 	lnk->link_id = smcr_next_link_id(lgr);
+	lnk->max_send_wr = lgr->max_send_wr;
+	lnk->max_recv_wr = lgr->max_recv_wr;
initializes the link values with the values from the lgr which are in
turn picked up from the sysctls at lgr creation time. I have made an
effort to keep these values the same for each link, but in case the
allocation fails and we do back off, we can end up with different values
on the links. 

The alternative would be to throw in the towel, and not create
a second link if we can't match what worked for the first one.
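
The back-off behaviour described above can be illustrated with a small userspace sketch. This is not the code from the patch: `demo_lgr`, `demo_link`, `demo_alloc_link_mem()`, `demo_link_init()`, the per-WR buffer size, the halving step and the `min_wr` floor are all hypothetical stand-ins chosen for illustration. It only shows how a link can start from the link group's value and end up with a smaller `max_send_wr` when allocation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real net/smc structures. */
struct demo_lgr  { uint16_t max_send_wr; };
struct demo_link { uint16_t max_send_wr; void *send_bufs; };

#define DEMO_WR_BUF_SIZE 48u	/* assumed per-WR buffer size, illustration only */

/* Simulated allocator: reports -ENOMEM when the request exceeds alloc_limit. */
static int demo_alloc_link_mem(struct demo_link *lnk, size_t alloc_limit)
{
	size_t bytes = (size_t)lnk->max_send_wr * DEMO_WR_BUF_SIZE;

	if (bytes > alloc_limit)
		return -ENOMEM;
	lnk->send_bufs = calloc(lnk->max_send_wr, DEMO_WR_BUF_SIZE);
	return lnk->send_bufs ? 0 : -ENOMEM;
}

/*
 * Start from the link group's value and back off on -ENOMEM.
 * Halving and the min_wr floor are assumptions of this sketch.
 */
static int demo_link_init(const struct demo_lgr *lgr, struct demo_link *lnk,
			  uint16_t min_wr, size_t alloc_limit)
{
	int rc;

	assert(min_wr > 0);
	lnk->max_send_wr = lgr->max_send_wr;	/* same starting point for every link */
	lnk->send_bufs = NULL;

	for (;;) {
		rc = demo_alloc_link_mem(lnk, alloc_limit);
		if (rc != -ENOMEM)
			return rc;		/* success, or an unrelated error */
		if (lnk->max_send_wr / 2 < min_wr)
			return -ENOMEM;		/* cannot back off any further */
		lnk->max_send_wr /= 2;		/* retry with a smaller queue */
	}
}
```

In this sketch, two links initialised from the same `demo_lgr` get identical values when both allocations succeed on the first try, and diverge only when one of them has to back off, which is the situation discussed in this thread.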
Since in Alibaba we don't use multi-link configurations, we haven't tested
this scenario. Have you tested the link-down handling process in a multi-link
setup?
Mahanta was so kind to do most of the testing on this. I don't think
I've tested this myself. @Mahanta: Would you be so kind as to give this a try
if it wasn't covered in the past? The best way is probably to modify
the code to force such a scenario. I don't think it is easy to somehow
trigger in the wild.

BTW I don't expect any problems. I think at worst the one link would
end up giving worse performance than the other, but I guess that can
happen for other reasons as well (like different HW for the two links).

But I think getting some sort of a query interface which would tell
us how much we ended up with down the road would be a good idea anyway.

And I hope we can switch to vmalloc down the road as well, which would
make back off less likely.
Unfortunately we are closing the net-next PR right now and I would
prefer such testing being reported explicitly. Let's defer this series
to the next cycle: please re-post when net-next reopens after Oct 12th.

Thanks,

Paolo