Re: [PATCH v2] SUNRPC: Cleanup/fix initial rq_pages allocation
From: Chuck Lever <hidden>
Date: 2025-06-09 19:26:11
On 6/9/25 1:21 PM, Benjamin Coddington wrote:
While investigating some reports of memory-constrained NUMA machines failing to mount v3 and v4.0 nfs mounts, we found that svc_init_buffer() was not attempting to retry allocations from the bulk page allocator. Typically, this results in a single page allocation being returned and the mount attempt fails with -ENOMEM. A retry would have allowed the mount to succeed. Additionally, it seems that the bulk allocation in svc_init_buffer() is redundant because svc_alloc_arg() will perform the required allocation and does the correct thing to retry the allocations. The call to allocate memory in svc_alloc_arg() drops the preferred node argument, but I expect we'll still allocate on the preferred node because the allocation call happens within the svc thread context, which chooses the node with memory closest to the current thread's execution.
IIUC this assumption might be incorrect. When a @node argument is
passed in, the allocator tries to allocate memory on that node only.
When the non-node API is used, the local node is tried first, but if
that allocation fails, it looks on other nodes for free pages.
This is how svc_alloc_arg has worked for a while. I tried to make this a
node-specific allocation in 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages
from node-local memory"), but that had to be reverted because there
are some cases where the svc_pool's sv_id is not valid to use as the
node identifier.
But otherwise, IMO de-duplicating the code that fills rq_pages seems
like a step forward. I can take v2 and drop the above paragraph if I get
one or more additional R-b's. This is probably nfsd-fixes material.
This patch cleans out the bulk allocation in svc_init_buffer() to allow svc_alloc_arg() to handle the allocation/retry logic for rq_pages.
Fixes: ed603bcf4fea ("sunrpc: Replace the rq_pages array with
dynamically-allocated memory")
quoted hunk ↗ jump to hunk
Signed-off-by: Benjamin Coddington <redacted> -- On v2: - rebased on nfsd-next - keep the rq_pages array allocation in svc_init_buffer(), defer the page allocation to svc_alloc_arg() --- net/sunrpc/svc.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index 939b6239df8a..ef8a05aac87f 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c@@ -638,8 +638,6 @@ EXPORT_SYMBOL_GPL(svc_destroy); static bool svc_init_buffer(struct svc_rqst *rqstp, const struct svc_serv *serv, int node) { - unsigned long ret; - rqstp->rq_maxpages = svc_serv_maxpages(serv); /* rq_pages' last entry is NULL for historical reasons. */@@ -649,9 +647,7 @@ svc_init_buffer(struct svc_rqst *rqstp, const struct svc_serv *serv, int node) if (!rqstp->rq_pages) return false; - ret = alloc_pages_bulk_node(GFP_KERNEL, node, rqstp->rq_maxpages, - rqstp->rq_pages); - return ret == rqstp->rq_maxpages; + return true; } /*
-- Chuck Lever