Re: [PATCH for-next v2] RDMA/core/sa_query: Retry SA queries
From: Haakon Bugge <hidden>
Date: 2021-08-26 15:59:56
Also in: lkml
On 25 Aug 2021, at 19:49, Jason Gunthorpe [off-list ref] wrote:

> On Thu, Aug 12, 2021 at 06:12:35PM +0200, Håkon Bugge wrote:
>> A MAD packet is sent as an unreliable datagram (UD). SA requests are
>> sent as MAD packets. As such, SA requests or responses may be silently
>> dropped.
>>
>> IB Core's MAD layer has a timeout and retry mechanism, which amongst
>> other, is used by RDMA CM. But it is not used by SA queries. The lack
>> of retries of SA queries leads to long specified timeout, and error
>> being returned in case of packet loss. The ULP or user-land process
>> has to perform the retry.
>>
>> Fix this by taking advantage of the MAD layer's retry mechanism.
>>
>> First, a check against a zero timeout is added in
>> rdma_resolve_route(). In send_mad(), we set the MAD layer timeout to
>> one tenth of the specified timeout and the number of retries to
>> 10. The special case when timeout is less than 10 is handled.
>>
>> With this fix:
>>
>>  # ucmatose -c 1000 -S 1024 -C 1
>>
>> runs stable on an Infiniband fabric. Without this fix, we see an
>> intermittent behavior and it errors out with:
>>
>> cmatose: event: RDMA_CM_EVENT_ROUTE_ERROR, error: -110
>>
>> (110 is ETIMEDOUT)
>>
>> Fixes: f75b7a529494 ("[PATCH] IB: Add automatic retries to MAD layer")
>> Signed-off-by: Håkon Bugge <redacted>
>> ---
>>  drivers/infiniband/core/cma.c      | 3 +++
>>  drivers/infiniband/core/sa_query.c | 9 ++++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> I'm nervous about this, mostly because the mad layer is very
> complicated, but it does seem aligned with the spec.
>
> However, it seems quite wrong that the timeout comes in from outside,
> the SA timeout should be integral to the SA layer..
They are quite different (timeout in ms):

  iser:      1000
  rtrs:     30000
  srp:       1000
  nvme:      3000
  samba:     5000
  p9:       30000
  rds:       5000
  xprtrdma:  5000

Dividing 30 seconds by ten gives 3 seconds, which seems OK. But for iser/srp we get 100 ms, which I would expect to be on the low end for some systems.
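To make the arithmetic concrete, here is a minimal user-space sketch of the split described in the patch text: the caller's timeout is divided by ten for the per-attempt MAD timeout, with ten retries. This is not the code from the patch. The struct `sa_retry_plan` and the function `sa_split_timeout()` are hypothetical names, and the handling of a timeout below ten (1 ms per attempt, retry count equal to the timeout) is only one plausible reading of "the special case when timeout is less than 10 is handled".

```c
#include <assert.h>

/* Hypothetical container for the two values handed to the MAD layer. */
struct sa_retry_plan {
	unsigned long timeout_ms;	/* timeout per attempt, in ms */
	unsigned int retries;		/* number of retries */
};

/*
 * Split the caller's overall timeout into a per-attempt timeout and a
 * retry count. A zero total_ms is assumed to be rejected earlier by
 * the caller, as the patch text describes for rdma_resolve_route().
 */
struct sa_retry_plan sa_split_timeout(unsigned long total_ms,
				      unsigned int max_retries)
{
	struct sa_retry_plan plan;

	assert(max_retries > 0);

	plan.timeout_ms = total_ms / max_retries;
	plan.retries = max_retries;

	if (plan.timeout_ms == 0) {
		/*
		 * Assumption: for total_ms < max_retries, fall back to
		 * 1 ms per attempt and use the total as the retry count.
		 */
		plan.timeout_ms = 1;
		plan.retries = (unsigned int)total_ms;
	}

	return plan;
}
```

With `max_retries` set to 10, the timeouts listed above give 100 ms per attempt for iser and srp, 300 ms for nvme, 500 ms for samba, rds and xprtrdma, and 3000 ms for rtrs and p9.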
> Anyhow, applied to for-next
Thanks!

Håkon