Re: [PATCH] svcrdma: disable timeouts on rdma backchannel
From: Chuck Lever <chuck.lever@oracle.com>
Date: 2021-02-24 20:05:51
On Feb 24, 2021, at 3:02 PM, J. Bruce Fields [off-list ref] wrote: On Wed, Feb 24, 2021 at 02:18:18PM +0000, Chuck Lever wrote:quoted
quoted
On Feb 22, 2021, at 6:36 PM, Timo Rothenpieler [off-list ref] wrote: This brings it in line with the regular tcp backchannel, which also has all those timeouts disabled. Prevents the backchannel from timing out, getting some async operations like server side copying getting stuck indefinitely on the client side. Signed-off-by: Timo Rothenpieler <redacted>Thanks for your patch! I've included it in the for-rc branch at git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.gitSo, I'm sure this patch makes sense. But I'm also curious why it's not recovering.
Agreed. This patch is not a substitute for proper callback channel recovery.
What I think should happen: - clp->cl_cb_state should be set to NFSD4_CB_DOWN.
I think it's set to FAULT.
- This should cause the next SEQUENCE reply to have SEQ4_STATUS_CB_PATH_DOWN set. - That should poke the client to recover. (Maybe by sending a BIND_CONN_TO_SESSION call?) I'd be curious whether any of that's actually happening. --b.quoted
quoted
--- Did the same testing with this applied than before, and could not observe it getting stuck, same as with the previous patch, which I removed before testing this one. This obviously still does not fix the issue of it being seemingly unable to reestablish the disconnected backchannel. An event that disconnects the backchannel but leaves the main connection intact seems a pretty rare occurance though, outside of this issue. net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c index 63f8be974df2..8186ab6f99f1 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c@@ -252,9 +252,9 @@ xprt_setup_rdma_bc(struct xprt_create *args)xprt->timeout = &xprt_rdma_bc_timeout; xprt_set_bound(xprt); xprt_set_connected(xprt); - xprt->bind_timeout = RPCRDMA_BIND_TO; - xprt->reestablish_timeout = RPCRDMA_INIT_REEST_TO; - xprt->idle_timeout = RPCRDMA_IDLE_DISC_TO; + xprt->bind_timeout = 0; + xprt->reestablish_timeout = 0; + xprt->idle_timeout = 0; xprt->prot = XPRT_TRANSPORT_BC_RDMA; xprt->ops = &xprt_rdma_bc_procs; -- 2.25.1-- Chuck Lever
-- Chuck Lever