Thread (5 messages) 5 messages, 2 authors, 2023-12-11

Re: [PATCH v2] SUNRPC: Fixup v4.1 backchannel request timeouts

From: Trond Myklebust <hidden>
Date: 2023-12-11 13:54:37

On Mon, 2023-12-11 at 06:53 -0500, Benjamin Coddington wrote:
On 9 Dec 2023, at 4:55, Trond Myklebust wrote:
quoted
On Fri, 2023-12-08 at 14:19 -0500, Benjamin Coddington wrote:
quoted
After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time
out
when on
the sending list"), any 4.1 backchannel tasks placed on the
sending
queue
would immediately return with -ETIMEDOUT since their req timers
are
zero.
We can fix this by keeping a copy of the rpc_clnt's timeout
params on
the
transport and using them to properly setup the timeouts on the
v4.1
backchannel tasks' req.

Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when
on
the sending list")
Signed-off-by: Benjamin Coddington <redacted>
---
 include/linux/sunrpc/xprt.h |  1 +
 net/sunrpc/clnt.c           |  3 +++
 net/sunrpc/xprt.c           | 23 ++++++++++++++---------
 3 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/include/linux/sunrpc/xprt.h
b/include/linux/sunrpc/xprt.h
index f85d3a0daca2..7565902053f3 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -285,6 +285,7 @@ struct rpc_xprt {
 						 * items */
 	struct list_head	bc_pa_list;	/* List of
preallocated
 						 * backchannel
rpc_rqst's */
+	struct rpc_timeout	bc_timeout;	/* backchannel
timeout params */
 #endif /* CONFIG_SUNRPC_BACKCHANNEL */
 
 	struct rb_root		recv_queue;	/* Receive queue
*/
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index d6805c1268a7..5891757c88b1 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -279,6 +279,9 @@ static struct rpc_xprt
*rpc_clnt_set_transport(struct rpc_clnt *clnt,
 		clnt->cl_autobind = 1;
 
 	clnt->cl_timeout = timeout;
+#if defined(CONFIG_SUNRPC_BACKCHANNEL)
+	memcpy(&xprt->bc_timeout, timeout, sizeof(struct
rpc_timeout));
+#endif
Hmm... The xprt can and will be shared among a number of rpc_clnt
instances. I therefore think we're better off doing this when we're
setting up the back channel.
.. and it seems the timeouts could be different for each, so now I
think
keeping a copy of the last rpc_clnt's timeouts on the xprt is wrong.

We could use the current timeouts from the nfs_client side after we
figure out which nfs_client that would be in
nfs4_callback_compound().
Trouble is if we have to make a reply before getting that far there's
still no timeout set, but that's probably a rare case and could use
the xprt
type defaults.
The suggestion is that we consider the callback channel to be
associated with the 'service rpc_client', i.e. the one that is assigned
to the nfs_client and that is used for lease recovery, etc.

If you set it up after the negotiation of the session, and after
picking up the lease time, then you should have a useful value. There
will be no other client activity anyway until the client is marked as
being ready, so you can shoehorn it in there (in nfs41_init_clientid()
as suggested below) to address the common case.

Then there is the issue of bind_conn_to_session (when we're setting up
the callback channel anew because the server lost it) and the cases
where we're adding new xprts.
quoted
i.e. probably doing it in nfs41_init_clientid() after we picked up
the
lease time (but before we mark the client as ready), and then doing
it
in nfs4_proc_bind_conn_to_session() if ever that gets called.

Note that we have to set the bc_timeout on all xprts that could act
as
back channels, so you might want to use
rpc_clnt_iterate_for_each_xprt().
It might also be worth to look at Olga's trunking code, since I
suspect
we might need to do something there when adding a new xprt to the
existing set.
Ben
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help