Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements
From: Yunsheng Lin <hidden>
Date: 2025-02-21 09:34:32
Also in:
kvm, linux-btrfs, linux-mm, linux-nfs, linux-xfs, lkml, virtualization
On 2025/2/18 22:17, Chuck Lever wrote:
On 2/18/25 4:16 AM, Yunsheng Lin wrote:quoted
On 2025/2/17 22:20, Chuck Lever wrote:quoted
On 2/17/25 7:31 AM, Yunsheng Lin wrote:quoted
As mentioned in [1], it seems odd to check NULL elements in the middle of page bulk allocating,I think I requested that check to be added to the bulk page allocator. When sending an RPC reply, NFSD might release pages in the middle ofIt seems there is no usage of the page bulk allocation API in fs/nfsd/ or fs/nfs/, which specific fs the above 'NFSD' is referring to?NFSD is in fs/nfsd/, and it is the major consumer of net/sunrpc/svc_xprt.c.quoted
quoted
the rq_pages array, marking each of those array entries with a NULL pointer. We want to ensure that the array is refilled completely in this case.I did some researching, it seems you requested that in [1]? It seems the 'holes are always at the start' for the case in that discussion too, I am not sure if the case is referring to the caller in net/sunrpc/svc_xprt.c? If yes, it seems caller can do a better job of bulk allocating pages into a whole array sequentially without checking NULL elements first before doing the page bulk allocation as something below:+++ b/net/sunrpc/svc_xprt.c@@ -663,9 +663,10 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp) pages = RPCSVC_MAXPAGES; } - for (filled = 0; filled < pages; filled = ret) { - ret = alloc_pages_bulk(GFP_KERNEL, pages, rqstp->rq_pages); - if (ret > filled) + for (filled = 0; filled < pages; filled += ret) { + ret = alloc_pages_bulk(GFP_KERNEL, pages - filled, + rqstp->rq_pages + filled); + if (ret) /* Made progress, don't sleep yet */ continue;@@ -674,7 +675,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp) set_current_state(TASK_RUNNING); return false; } - trace_svc_alloc_arg_err(pages, ret); + trace_svc_alloc_arg_err(pages, filled); memalloc_retry_wait(GFP_KERNEL); } rqstp->rq_page_end = &rqstp->rq_pages[pages];1. https://lkml.iu.edu/hypermail/linux/kernel/2103.2/09060.htmlI still don't see what is broken about the current API.
As mentioned in [1], the page bulk alloc API before this patch may have some space for improvement from performance and easy-to-use perspective as the most existing calllers of page bulk alloc API are trying to bulk allocate the page for the whole array sequentially. 1. https://lore.kernel.org/all/c9950a79-7bcb-41c2-a59e-af315dc6d7ff@huawei.com/ (local)
Anyway, any changes in svc_alloc_arg() will need to be run through the upstream NFSD CI suite before they are merged.
Is there any web link pointing to the above NFSD CI suite, so that I can test it if removing assumption of populating only NULL elements is indeed possible? Look more closely, it seems svc_rqst_release_pages()/svc_rdma_save_io_pages() does set rqstp->rq_respages[i] to NULL based on rqstp->rq_next_page, and the original code before using the page bulk alloc API does seem to only allocate page for NULL elements as can see from the below patch: https://lore.kernel.org/all/20210325114228.27719-8-mgorman@techsingularity.net/T/#u (local) The clearing of rqstp->rq_respages[] to NULL does seems sequentially, is it possible to only pass NULL elements in rqstp->rq_respages[] to alloc_pages_bulk() so that bulk alloc API does not have to do the NULL checking and use the array only as output parameter?