Re: [PATCH net-next v5 2/2] net/smc: handle -ENOMEM from smc_wr_alloc_link_mem gracefully
From: Paolo Abeni <pabeni@redhat.com>
Date: 2025-10-01 07:21:15
Also in:
linux-doc, linux-rdma, linux-s390, lkml
On 9/29/25 11:22 AM, Halil Pasic wrote:
> On Mon, 29 Sep 2025 09:50:52 +0800 Dust Li [off-list ref] wrote:
>
>>> @@ -175,6 +175,8 @@ struct smc_link {
>>>  	struct completion	llc_testlink_resp; /* wait for rx of testlink */
>>>  	int			llc_testlink_time; /* testlink interval */
>>>  	atomic_t		conn_cnt; /* connections on this link */
>>> +	u16			max_send_wr;
>>> +	u16			max_recv_wr;
>>
>> Here, you've moved max_send_wr/max_recv_wr from the link group to
>> individual links. This means we can now have different
>> max_send_wr/max_recv_wr values on two different links within the
>> same link group.
>
> Only if allocations fail. Please notice that the hunk:
>
> --- a/net/smc/smc_core.c
> +++ b/net/smc/smc_core.c
> @@ -810,6 +810,8 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
>  	lnk->clearing = 0;
>  	lnk->path_mtu = lnk->smcibdev->pattr[lnk->ibport - 1].active_mtu;
>  	lnk->link_id = smcr_next_link_id(lgr);
> +	lnk->max_send_wr = lgr->max_send_wr;
> +	lnk->max_recv_wr = lgr->max_recv_wr;
>
> initializes the link values with the values from the lgr, which are in
> turn picked up from the sysctls at lgr creation time. I have made an
> effort to keep these values the same for each link, but in case the
> allocation fails and we do back off, we can end up with different
> values on the links.
>
> The alternative would be to throw in the towel, and not create a
> second link if we can't match what worked for the first one.
>
>> Since in Alibaba we don't use multi-link configurations, we haven't
>> tested this scenario. Have you tested the link-down handling process
>> in a multi-link setup?
>
> Mahanta was kind enough to do most of the testing on this. I don't
> think I've tested this myself. @Mahanta: Would you be so kind as to
> give this a try if it wasn't covered in the past? The best way is
> probably to modify the code to force such a scenario. I don't think it
> is easy to trigger in the wild.
>
> BTW I don't expect any problems. I think at worst the one link would
> end up giving worse performance than the other, but I guess that can
> happen for other reasons as well (like different HW for the two
> links).
>
> But I think getting some sort of a query interface, which would tell
> us how much we ended up with, would be a good idea down the road
> anyway. And I hope we can switch to vmalloc down the road as well,
> which would make back off less likely.

Unfortunately we are closing the net-next PR right now, and I would
prefer such testing to be reported explicitly. Let's defer this series
to the next cycle: please re-post when net-next reopens after Oct 12th.

Thanks,

Paolo
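
For readers following the back-off discussion quoted above, here is a minimal userspace sketch of how two links that start from the same link-group limit can end up with different per-link values once an allocation fails. It is not the code from the patch: every `demo_*` name is hypothetical, the allocator hook exists only to force a failure, and halving the request on each failure is an assumption about the back-off policy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the link group: one limit shared by all links. */
struct demo_lgr {
	uint16_t max_send_wr;	/* e.g. taken from a sysctl at lgr creation */
};

/* Hypothetical stand-in for a link: its own copy of the limit. */
struct demo_link {
	uint16_t max_send_wr;	/* may end up lower than the lgr value */
	void *send_bufs;
};

/* Allocator hook so that a failure can be forced for testing. */
typedef void *(*demo_alloc_fn)(size_t nmemb, size_t size);

/* Requests for more than this many elements fail in demo_limited_alloc(). */
static size_t demo_fail_above;

static void *demo_limited_alloc(size_t nmemb, size_t size)
{
	if (nmemb > demo_fail_above)
		return NULL;
	return calloc(nmemb, size);
}

/*
 * Start from the lgr-wide value and halve the request on each allocation
 * failure (assumed policy), never going below min_wr.
 * Returns 0 on success, -1 if even min_wr elements cannot be allocated.
 */
static int demo_link_alloc(struct demo_link *lnk, const struct demo_lgr *lgr,
			   uint16_t min_wr, size_t wr_size,
			   demo_alloc_fn alloc)
{
	uint16_t want = lgr->max_send_wr;

	assert(min_wr > 0);
	if (want < min_wr)
		want = min_wr;

	for (;;) {
		lnk->send_bufs = alloc(want, wr_size);
		if (lnk->send_bufs) {
			lnk->max_send_wr = want;	/* record what we got */
			return 0;
		}
		if (want == min_wr)
			return -1;	/* nothing left to back off to */
		want /= 2;
		if (want < min_wr)
			want = min_wr;
	}
}
```

Calling `demo_link_alloc()` for a first link while allocations succeed, and for a second link after lowering `demo_fail_above`, leaves the two links with different `max_send_wr` values. That is the scenario the thread suggests forcing in a multi-link test.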