Thread (5 messages) 5 messages, 2 authors, 2023-12-11

Re: [PATCH v2] SUNRPC: Fixup v4.1 backchannel request timeouts

From: Benjamin Coddington <hidden>
Date: 2023-12-11 14:22:27

On 11 Dec 2023, at 8:54, Trond Myklebust wrote:
On Mon, 2023-12-11 at 06:53 -0500, Benjamin Coddington wrote:
quoted
On 9 Dec 2023, at 4:55, Trond Myklebust wrote:
quoted
On Fri, 2023-12-08 at 14:19 -0500, Benjamin Coddington wrote:
quoted
After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time
out
when on
the sending list"), any 4.1 backchannel tasks placed on the
sending
queue
would immediately return with -ETIMEDOUT since their req timers
are
zero.
We can fix this by keeping a copy of the rpc_clnt's timeout
params on
the
transport and using them to properly setup the timeouts on the
v4.1
backchannel tasks' req.

Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when
on
the sending list")
Signed-off-by: Benjamin Coddington <redacted>
---
 include/linux/sunrpc/xprt.h |  1 +
 net/sunrpc/clnt.c           |  3 +++
 net/sunrpc/xprt.c           | 23 ++++++++++++++---------
 3 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/include/linux/sunrpc/xprt.h
b/include/linux/sunrpc/xprt.h
index f85d3a0daca2..7565902053f3 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -285,6 +285,7 @@ struct rpc_xprt {
 						 * items */
 	struct list_head	bc_pa_list;	/* List of
preallocated
 						 * backchannel
rpc_rqst's */
+	struct rpc_timeout	bc_timeout;	/* backchannel
timeout params */
 #endif /* CONFIG_SUNRPC_BACKCHANNEL */
 
 	struct rb_root		recv_queue;	/* Receive queue
*/
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index d6805c1268a7..5891757c88b1 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -279,6 +279,9 @@ static struct rpc_xprt
*rpc_clnt_set_transport(struct rpc_clnt *clnt,
 		clnt->cl_autobind = 1;
 
 	clnt->cl_timeout = timeout;
+#if defined(CONFIG_SUNRPC_BACKCHANNEL)
+	memcpy(&xprt->bc_timeout, timeout, sizeof(struct
rpc_timeout));
+#endif
Hmm... The xprt can and will be shared among a number of rpc_clnt
instances. I therefore think we're better off doing this when we're
setting up the back channel.
.. and it seems the timeouts could be different for each, so now I
think
keeping a copy of the last rpc_clnt's timeouts on the xprt is wrong.

We could use the current timeouts from the nfs_client side after we
figure out which nfs_client that would be in
nfs4_callback_compound().
Trouble is if we have to make a reply before getting that far there's
still no timeout set, but that's probably a rare case and could use
the xprt
type defaults.
The suggestion is that we consider the callback channel to be
associated with the 'service rpc_client', i.e. the one that is assigned
to the nfs_client and that is used for lease recovery, etc.
Right, I'm on the same page - there's just no place to keep the timeout
currently.  We can't just carry it on the xprt since the xprt can be used
for various clients, and there's multiple xprts to handle too.
If you set it up after the negotiation of the session, and after
picking up the lease time, then you should have a useful value. There
will be no other client activity anyway until the client is marked as
being ready, so you can shoehorn it in there (in nfs41_init_clientid()
as suggested below) to address the common case.

Then there is the issue of bind_conn_to_session (when we're setting up
the callback channel anew because the server lost it) and the cases
where we're adding new xprts.
I think all these cases can be avoided by setting the callback responses'
req timeout values during callback processing after we've looked up the
appropriate nfs_client for the response.  Then we can just use the
cl_timeout, no need to touch the xprts.

Otherwise, I'm looking at every xprt having a list of bc_timeouts for every
rpc_client or some mess like that.

I really appreciate your suggestions,
Ben
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help