Re: [PATCH 1/1] SUNRPC: call_connect_status needs to destroy transport on ETIMEDOUT before retry
From: Trond Myklebust <trondmy@kernel.org>
Date: 2025-08-04 19:21:16
On Mon, 2025-08-04 at 12:08 -0700, Dai Ngo wrote:
> Currently, when an RPC connection times out during the connect phase,
> the task is retried by placing it back on the pending queue and waiting
> again. In some cases, the timeout occurs because TCP is unable to send
> the SYN packet. This situation most often arises on bare metal systems
> at boot time, when the NFS mount is attempted while the network link
> appears to be up but is not yet stable.
>
> This patch addresses the issue by updating call_connect_status to
> destroy the transport on ETIMEDOUT error before retrying the
> connection. This ensures that subsequent connection attempts use a
> fresh transport, reducing the likelihood of repeated failures due to
> lingering network issues.
>
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  net/sunrpc/clnt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 21426c3049d3..701b742750c5 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2215,6 +2215,7 @@ call_connect_status(struct rpc_task *task)
>  	case -EHOSTUNREACH:
>  	case -EPIPE:
>  	case -EPROTO:
> +	case -ETIMEDOUT:
>  		xprt_conditional_disconnect(task->tk_rqstp->rq_xprt,
>  					    task->tk_rqstp->rq_connect_cookie);
>  		if (RPC_IS_SOFTCONN(task))
> @@ -2225,7 +2226,6 @@ call_connect_status(struct rpc_task *task)
>  	case -EADDRINUSE:
>  	case -ENOTCONN:
>  	case -EAGAIN:
> -	case -ETIMEDOUT:
>  		if (!(task->tk_flags & RPC_TASK_NO_ROUND_ROBIN) &&
>  		    (task->tk_flags & RPC_TASK_MOVEABLE) &&
>  		    test_bit(XPRT_REMOVE, &xprt->state)) {

Why is this needed? The ETIMEDOUT is supposed to be a task level error,
not a connection level thing.

Oh... Is this because of TLS? If so, then please fix that to use a more
appropriate error.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com