Re: [PATCH v2] SUNRPC: Fixup v4.1 backchannel request timeouts
From: Benjamin Coddington <hidden>
Date: 2023-12-11 12:10:00
On 9 Dec 2023, at 4:55, Trond Myklebust wrote:
On Fri, 2023-12-08 at 14:19 -0500, Benjamin Coddington wrote:quoted
After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list"), any 4.1 backchannel tasks placed on the sending queue would immediately return with -ETIMEDOUT since their req timers are zero. We can fix this by keeping a copy of the rpc_clnt's timeout params on the transport and using them to properly setup the timeouts on the v4.1 backchannel tasks' req. Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list") Signed-off-by: Benjamin Coddington <redacted> --- include/linux/sunrpc/xprt.h | 1 + net/sunrpc/clnt.c | 3 +++ net/sunrpc/xprt.c | 23 ++++++++++++++--------- 3 files changed, 18 insertions(+), 9 deletions(-)diff --git a/include/linux/sunrpc/xprt.hb/include/linux/sunrpc/xprt.h index f85d3a0daca2..7565902053f3 100644--- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h@@ -285,6 +285,7 @@ struct rpc_xprt {* items */ struct list_head bc_pa_list; /* List of preallocated * backchannel rpc_rqst's */ + struct rpc_timeout bc_timeout; /* backchannel timeout params */ #endif /* CONFIG_SUNRPC_BACKCHANNEL */ struct rb_root recv_queue; /* Receive queue */diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index d6805c1268a7..5891757c88b1 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c@@ -279,6 +279,9 @@ static struct rpc_xprt*rpc_clnt_set_transport(struct rpc_clnt *clnt, clnt->cl_autobind = 1; clnt->cl_timeout = timeout; +#if defined(CONFIG_SUNRPC_BACKCHANNEL) + memcpy(&xprt->bc_timeout, timeout, sizeof(struct rpc_timeout)); +#endifHmm... The xprt can and will be shared among a number of rpc_clnt instances. I therefore think we're better off doing this when we're setting up the back channel.
.. and it seems the timeouts could be different for each, so now I think keeping a copy of the last rpc_clnt's timeouts on the xprt is wrong. We could use the current timeouts from the nfs_client side after we figure out which nfs_client that would be in nfs4_callback_compound(). Trouble is if we have to make a reply before getting that far there's still no timeout set, but that's probably a rare case and could use the xprt type defaults.
i.e. probably doing it in nfs41_init_clientid() after we picked up the lease time (but before we mark the client as ready), and then doing it in nfs4_proc_bind_conn_to_session() if ever that gets called. Note that we have to set the bc_timeout on all xprts that could act as back channels, so you might want to use rpc_clnt_iterate_for_each_xprt(). It might also be worth to look at Olga's trunking code, since I suspect we might need to do something there when adding a new xprt to the existing set.
Ben