Thread (21 messages) 21 messages, 5 authors, 2021-12-01

Re: [PATCH v7 3/5] mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY

From: Feng Tang <hidden>
Date: 2021-08-09 02:44:38
Also in: linux-api, lkml
Subsystem: hugetlb subsystem, memory management, memory management - memory policy and migration, the rest · Maintainers: Muchun Song, Oscar Salvador, Andrew Morton, David Hildenbrand, Linus Torvalds

Hi Michal,

Thanks for the review and ACKs to 1/5 and 2/5 patches.

On Fri, Aug 06, 2021 at 03:35:48PM +0200, Michal Hocko wrote:
On Tue 03-08-21 13:59:20, Feng Tang wrote:
quoted
From: Ben Widawsky <redacted>

Implement the missing huge page allocation functionality while obeying
the preferred node semantics. This is similar to the implementation
for general page allocation, as it uses a fallback mechanism to try
multiple preferred nodes first, and then all other nodes.

[akpm: fix compling issue when merging with other hugetlb patch]
[Thanks to 0day bot for catching the missing #ifdef CONFIG_NUMA issue]
Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com (local)
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Ben Widawsky <redacted>
Co-developed-by: Feng Tang <redacted>
Signed-off-by: Feng Tang <redacted>
ifdefery is just ugly as hell. One way to get rid of that would be to
provide a mpol_is_preferred_many() wrapper and hide the CONFIG_NUMA in
mempolicy.h. I haven't checked but this might help to remove some other
ifdefery as well.

I especially dislike the label hidden in the ifdef. You can get rid of
that by checking the page for NULL.
Yes, the 'ifdef's were annoying to me too, and thanks for the suggestions.
Following is the revised patch upon the suggestion.

Thanks,
Feng

-------8<---------------------

From fc30718c40f02ba5ea73456af49173e66b5032c1 Mon Sep 17 00:00:00 2001
From: Ben Widawsky <redacted>
Date: Thu, 5 Aug 2021 23:01:11 -0400
Subject: [PATCH] mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY

Implement the missing huge page allocation functionality while obeying the
preferred node semantics.  This is similar to the implementation for
general page allocation, as it uses a fallback mechanism to try multiple
preferred nodes first, and then all other nodes. 

To avoid adding too many "#ifdef CONFIG_NUMA" check, add a helper function
in mempolicy.h to check whether a mempolicy is MPOL_PREFERRED_MANY.

[akpm: fix compling issue when merging with other hugetlb patch]
[Thanks to 0day bot for catching the !CONFIG_NUMA compiling issue]
[Michal Hocko: suggest to remove the #ifdef CONFIG_NUMA check]
Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com (local)
Link: https://lkml.kernel.org/r/1627970362-61305-4-git-send-email-feng.tang@intel.com
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Ben Widawsky <redacted>
Co-developed-by: Feng Tang <redacted>
Signed-off-by: Feng Tang <redacted>
--
 include/linux/mempolicy.h | 12 ++++++++++++
 mm/hugetlb.c              | 28 ++++++++++++++++++++++++----
 2 files changed, 36 insertions(+), 4 deletions(-)
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 0117e1e..60d5e6c 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -187,6 +187,12 @@ extern void mpol_put_task_policy(struct task_struct *);
 
 extern bool numa_demotion_enabled;
 
+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return  (pol->mode == MPOL_PREFERRED_MANY);
+}
+
+
 #else
 
 struct mempolicy {};
@@ -297,5 +303,11 @@ static inline nodemask_t *policy_nodemask_current(gfp_t gfp)
 }
 
 #define numa_demotion_enabled	false
+
+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return  false;
+}
+
 #endif /* CONFIG_NUMA */
 #endif
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 95714fb..75ea8bc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1145,7 +1145,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 				unsigned long address, int avoid_reserve,
 				long chg)
 {
-	struct page *page;
+	struct page *page = NULL;
 	struct mempolicy *mpol;
 	gfp_t gfp_mask;
 	nodemask_t *nodemask;
@@ -1166,7 +1166,17 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 
 	gfp_mask = htlb_alloc_mask(h);
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
+	if (mpol_is_preferred_many(mpol)) {
+		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
+		/* Fallback to all nodes if page==NULL */
+		nodemask = NULL;
+	}
+
+	if (!page)
+		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
 	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
 		SetHPageRestoreReserve(page);
 		h->resv_huge_pages--;
@@ -2147,9 +2157,19 @@ struct page *alloc_buddy_huge_page_with_mpol(struct hstate *h,
 	nodemask_t *nodemask;
 
 	nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask);
-	page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask, false);
-	mpol_cond_put(mpol);
+	if (mpol_is_preferred_many(mpol)) {
+		gfp_t gfp = gfp_mask | __GFP_NOWARN;
 
+		gfp &=  ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+		page = alloc_surplus_huge_page(h, gfp, nid, nodemask, false);
+
+		/* Fallback to all nodes if page==NULL */
+		nodemask = NULL;
+	}
+
+	if (!page)
+		page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask, false);
+	mpol_cond_put(mpol);
 	return page;
 }
 
-- 
2.7.4


Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help