Re: [PATCH 1/2] SUNRPC: double free xprt_ctxt while still in use
From: Jeff Layton <jlayton@kernel.org>
Date: 2023-05-09 14:03:26
On Tue, 2023-05-09 at 09:41 +1000, NeilBrown wrote:
When an RPC request is deferred, the rq_xprt_ctxt pointer is moved out of the svc_rqst into the svc_deferred_req. When the deferred request is revisited, the pointer is copied into the new svc_rqst - and also remains in the svc_deferred_req. In the (rare?) case that the request is deferred a second time, the old svc_deferred_req is reused - it still has all the correct content. However in that case the rq_xprt_ctxt pointer is NOT cleared so that when xpo_release_xprt is called, the ctxt is freed (UDP) or possible added to a free list (RDMA). When the deferred request is revisited for a second time, it will reference this ctxt which may be invalid, and the free the object a second time which is likely to oops.
I've always found the deferral code to be really hard to follow. The
dearth of comments around the design doesn't help either...
To be clear, when we call into svc_defer, if rq_deferred is already
set, then that means that we're revisiting a request that has already
been deferred at least once?
quoted hunk ↗ jump to hunk
So change svc_defer() to *always* clear rq_xprt_ctxt, and assert that the value is now stored in the svc_deferred_req. Fixes: 773f91b2cf3f ("SUNRPC: Fix NFSD's request deferral on RDMA transports") Signed-off-by: NeilBrown <redacted> --- net/sunrpc/svc_xprt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index 84e5d7d31481..5fd94f6bdc75 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c@@ -1223,13 +1223,14 @@ static struct cache_deferred_req *svc_defer(struct cache_req *req) dr->daddr = rqstp->rq_daddr; dr->argslen = rqstp->rq_arg.len >> 2; dr->xprt_ctxt = rqstp->rq_xprt_ctxt; - rqstp->rq_xprt_ctxt = NULL; /* back up head to the start of the buffer and copy */ skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len; memcpy(dr->args, rqstp->rq_arg.head[0].iov_base - skip, dr->argslen << 2); } + WARN_ON_ONCE(rqstp->rq_xprt_ctxt != dr->xprt_ctxt); + rqstp->rq_xprt_ctxt = NULL; trace_svc_defer(rqstp); svc_xprt_get(rqstp->rq_xprt); dr->xprt = rqstp->rq_xprt;
I think this looks right, assuming my understanding of what rq_deferred == NULL indicates. Reviewed-by: Jeff Layton <jlayton@kernel.org>