Thread (6 messages) 6 messages, 4 authors, 2025-06-10

Re: [PATCH v2] SUNRPC: Cleanup/fix initial rq_pages allocation

From: Benjamin Coddington <hidden>
Date: 2025-06-09 23:54:54

On 9 Jun 2025, at 15:25, Chuck Lever wrote:
On 6/9/25 1:21 PM, Benjamin Coddington wrote:
quoted
While investigating some reports of memory-constrained NUMA machines
failing to mount v3 and v4.0 nfs mounts, we found that svc_init_buffer()
was not attempting to retry allocations from the bulk page allocator.
Typically, this results in a single page allocation being returned and
the mount attempt fails with -ENOMEM.  A retry would have allowed the mount
to succeed.

Additionally, it seems that the bulk allocation in svc_init_buffer() is
redundant because svc_alloc_arg() will perform the required allocation and
does the correct thing to retry the allocations.

The call to allocate memory in svc_alloc_arg() drops the preferred node
argument, but I expect we'll still allocate on the preferred node because
the allocation call happens within the svc thread context, which chooses
the node with memory closest to the current thread's execution.
IIUC this assumption might be incorrect. When a @node argument is
passed in, the allocator tries to allocate memory on that node only.
When the non-node API is used, the local node is tried first, but if
that allocation fails, it looks on other nodes for free pages.

This is how svc_alloc_arg has worked for a while. I tried to make this a
node-specific allocation in 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages
from node-local memory"), but that had to be reverted because there
are some cases where the svc_pool's sv_id is not valid to use as the
node identifier.
It might be possible to know whether we're in per-pool mode in
svc_alloc_arg(), I can look into it.  In that case we could just fall back
to the non-node specific bulk allocation.  I'd rather not be inserting
hard-to-find performance regressions.
But otherwise, IMO de-duplicating the code that fills rq_pages seems
like a step forward. I can take v2 and drop the above paragraph if I get
one or more additional R-b's. This is probably nfsd-fixes material.
Roger.
quoted
This patch cleans out the bulk allocation in svc_init_buffer() to allow
svc_alloc_arg() to handle the allocation/retry logic for rq_pages.
Fixes: ed603bcf4fea ("sunrpc: Replace the rq_pages array with
dynamically-allocated memory")
Thanks for the attention, Chuck.

Ben
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help