Re: [PATCH v2] SUNRPC: Cleanup/fix initial rq_pages allocation
From: Benjamin Coddington <hidden>
Date: 2025-06-09 23:54:54
On 9 Jun 2025, at 15:25, Chuck Lever wrote:
On 6/9/25 1:21 PM, Benjamin Coddington wrote:quoted
While investigating some reports of memory-constrained NUMA machines failing to mount v3 and v4.0 nfs mounts, we found that svc_init_buffer() was not attempting to retry allocations from the bulk page allocator. Typically, this results in a single page allocation being returned and the mount attempt fails with -ENOMEM. A retry would have allowed the mount to succeed. Additionally, it seems that the bulk allocation in svc_init_buffer() is redundant because svc_alloc_arg() will perform the required allocation and does the correct thing to retry the allocations. The call to allocate memory in svc_alloc_arg() drops the preferred node argument, but I expect we'll still allocate on the preferred node because the allocation call happens within the svc thread context, which chooses the node with memory closest to the current thread's execution.IIUC this assumption might be incorrect. When a @node argument is passed in, the allocator tries to allocate memory on that node only. When the non-node API is used, the local node is tried first, but if that allocation fails, it looks on other nodes for free pages. This is how svc_alloc_arg has worked for a while. I tried to make this a node-specific allocation in 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from node-local memory"), but that had to be reverted because there are some cases where the svc_pool's sv_id is not valid to use as the node identifier.
It might be possible to know whether we're in per-pool mode in svc_alloc_arg(), I can look into it. In that case we could just fall back to the non-node specific bulk allocation. I'd rather not be inserting hard-to-find performance regressions.
But otherwise, IMO de-duplicating the code that fills rq_pages seems like a step forward. I can take v2 and drop the above paragraph if I get one or more additional R-b's. This is probably nfsd-fixes material.
Roger.
quoted
This patch cleans out the bulk allocation in svc_init_buffer() to allow svc_alloc_arg() to handle the allocation/retry logic for rq_pages.Fixes: ed603bcf4fea ("sunrpc: Replace the rq_pages array with dynamically-allocated memory")
Thanks for the attention, Chuck. Ben