Thread: 6 messages, 3 authors, 2025-11-17

Re: [PATCH v1] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb

From: Lorenzo Stoakes <hidden>
Date: 2025-11-17 10:17:03
Also in: linux-mm, lkml

On Thu, Nov 13, 2025 at 04:21:41PM +0100, David Hildenbrand (Red Hat) wrote:
On 13.11.25 14:01, Lorenzo Stoakes wrote:
quoted
FYI, trivial to fix but a conflict on mm/Kconfig for mm-new:
Thanks for the review!

Yeah, this fix will have to obviously go in sooner. And it's easy to
resolve.

That's why this patch is already in mm/mm-hotfixes-unstable.
Ack.
[...]
quoted
On Wed, Nov 12, 2025 at 03:56:32PM +0100, David Hildenbrand (Red Hat) wrote:
quoted
In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
runtime allocation of gigantic hugetlb folios. In the meantime it evolved
into a generic way for the architecture to state that it supports
gigantic hugetlb folios.

In commit fae7d834c43c ("mm: add __dump_folio()") we started using
CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
Hm strange commit to introduce this :)
The first commit to be confused about what CONFIG_ARCH_HAS_GIGANTIC_PAGE
actually means (obviously hugetlb, ... :) ), and which sizes are possible...
Yeah, sigh, we love to make things confusing :)
[...]
quoted
quoted
To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16
GiB (possible on arm64 and powerpc). Note that on some powerpc
I guess this is due to 64 KiB base page possibilities. Fun :)

Will this cause powerpc to now support gigantic hugetlb pages when it didn't
before?
It's not really related to 64K IIRC, just the way
CONFIG_ARCH_FORCE_MAX_ORDER and other things interact with powerpc's ways of
mapping cont-pmd-like things for hugetlb.
Ah OK, as I was thinking if it's base pages we could just keep order the
same... if it's somehow possible to get higher sizes even without then
makes sense to specify.

Lord... I wonder if we should have a doc somewhere describing all the ins and
outs of this?

Not that I'm asking my perennially busy co-maintainer to do _even more_ work but
maybe an idea for the future :P
This patch here doesn't change any of that, it just makes us now correctly
detect that gigantic folios are indeed possible.
quoted
quoted
configurations, whether we actually have gigantic pages
depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
nothing really problematic about setting it unconditionally: we just try to
keep the value small so we can better detect problems in __dump_folio()
and inconsistencies around the expected largest folio in the system.

Ideally, we'd have a better way to obtain the maximum hugetlb folio size
and detect ourselves whether we really end up with gigantic folios. Let's
defer bigger changes and fix the warnings first.
Right.
quoted
While at it, handle gigantic DAX folios more clearly: DAX can only
end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
Yes, this is... quite something. Config implying gigantic THP possible but
actually only relevant to DAX...
quoted
Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
HUGETLB_PAGE.
Hm, I see:

config HUGETLB_PAGE
	def_bool HUGETLBFS
	select XARRAY_MULTI


Which means (unless I misunderstand Kconfig, very possible :) that this is
always set if HUGETLBFS is specified.
Yeah, def_bool enforces that both are set.
quoted
Would it be clearer to just check for
CONFIG_HUGETLBFS?
IMHO, MM code should focus on CONFIG_HUGETLB_PAGE (especially when dealing
with the page/folio aspects), not the FS part of it.
Yeah this is another weird fs/mm split for something that really is ultimately
an mm thing...
$ git grep CONFIG_HUGETLB_PAGE | wc -l
45
$ git grep CONFIG_HUGETLBFS | wc -l
7

Unsurprisingly, we are not being completely consistent :)
Well fair enough :)
quoted
quoted
Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
also allow for runtime allocations of folios in some more powerpc configs.
Ah OK you're answering the above. I mean I don't think it'll be a problem
either.
quoted
I don't think this is a problem, but if it is we could handle it through
__HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.

While __dump_page()/__dump_folio was also problematic (not handling dumping
of tail pages of such gigantic folios correctly), it doesn't relevant
critical enough to mark it as a fix.
Small typo 'it doesn't relevant critical enough' -> 'it doesn't seem
critical enough' perhaps? Doesn't really matter, only fixup if respin or
easy for Andrew to fix.
Ah yes, thanks.
quoted
Are you planning to do follow ups then I guess?
As time permits, I think this all needs to be reworked :(
Yup! :)
[...]
quoted
quoted
@@ -137,6 +137,7 @@ config PPC
  	select ARCH_HAS_DMA_OPS			if PPC64
  	select ARCH_HAS_FORTIFY_SOURCE
  	select ARCH_HAS_GCOV_PROFILE_ALL
+	select ARCH_HAS_GIGANTIC_PAGE		if ARCH_SUPPORTS_HUGETLBFS
Given we know the architecture can support it (presumably all powerpc
arches or all that can support hugetlbfs anyway?), this seems reasonable.
powerpc allows for quite some different configs, so I assume there are some
configs that don't allow ARCH_SUPPORTS_HUGETLBFS.
Ah OK.
[...]
quoted
quoted
  /*
   * There is no real limit on the folio size. We limit them to the maximum we
- * currently expect (e.g., hugetlb, dax).
+ * currently expect: with hugetlb, we expect no folios larger than 16 GiB.
Maybe worth saying 'see CONFIG_HAVE_GIGANTIC_FOLIOS definition' or something?
To me that's implied from the initial ifdef. But not strong opinion about
spelling that out.
quoted
quoted
+ */
+#define MAX_FOLIO_ORDER		get_order(SZ_16G)
Hmm, is the base page size somehow runtime adjustable on powerpc? Why isn't
PUD_ORDER good enough here?
We tried P4D_ORDER but even that doesn't work. I think we effectively end up
with cont-pmd/cont-PUD mappings (or even cont-p4d, I am not 100% sure
because the folding code complicates that).
Ah wow, didn't even know such things could be a thing :)
See powerpc's variant of huge_pte_alloc() where we have stuff like

p4d = p4d_offset(pgd_offset(mm, addr), addr);
if (!mm_pud_folded(mm) && sz >= P4D_SIZE)
	return (pte_t *)p4d;

As soon as we go to things like P4D_ORDER we're suddenly in the range of 512
GiB on x86 etc, so that's also not what we want as an easy fix. (and it
didn't work)
Yeah... better to be explicit about the ppc case I think you're right.
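
For reference, the order arithmetic behind the discussion above can be checked with a small userspace sketch. It only models what get_order() computes (the smallest order whose folio covers a given size); order_for_size() is a hypothetical helper written for illustration, not kernel code, and the page shifts used are the common 4 KiB (shift 12) and 64 KiB (shift 16) base page sizes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of get_order(): return the smallest order such that
 * (base page size << order) >= size. page_shift is log2 of the base
 * page size. Hypothetical helper for illustration only.
 */
static unsigned int order_for_size(uint64_t size, unsigned int page_shift)
{
	unsigned int order = 0;

	/* Keep the shift below well within 64 bits. */
	assert(size > 0 && size <= ((uint64_t)1 << 62) && page_shift < 62);

	while (((uint64_t)1 << (page_shift + order)) < size)
		order++;
	return order;
}
```

With 4 KiB base pages, order_for_size(16ULL << 30, 12) gives 22, while a 1 GiB PUD on x86-64 is order 18 and the 512 GiB P4D range is order 27. With 64 KiB base pages, 16 GiB is order 18. So the 16 GiB limit sits between PUD_ORDER and P4D_ORDER on x86-64, which fits the point that neither is a good generic bound here.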
quoted
Or does powerpc have some way of getting 16 GiB gigantic pages even with 4
KiB base page size?
IIUC, yes.

Take a look at MMU_PAGE_16G.
Ack yeah, surprising, but these arches can be a whole other world... too used to
basic arm64/x86-64 :)
There is MMU_PAGE_64G already defined, but it's essentially unused for now.
Hmm :)
quoted
quoted
+#else
+/*
+ * Without hugetlb, gigantic folios that are bigger than a single PUD are
+ * currently impossible.
   */
  #define MAX_FOLIO_ORDER		PUD_ORDER
  #endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 0e26f4fc8717b..ca3f146bc7053 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
  config PGTABLE_HAS_HUGE_LEAVES
  	def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE

+#
+# We can end up creating gigantic folio.
+#
+config HAVE_GIGANTIC_FOLIOS
+	def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
+		 (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
Maybe worth spelling out in a comment these two cases?
Not sure if the comments wouldn't just explain what we are reading?

"gigantic folios with hugetlb, PUD-sized folios with ZONE_DEVICE"?
Yeah true not vital.
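
As a reading aid, the two cases in the HAVE_GIGANTIC_FOLIOS expression can be modelled in plain C. This is only a sketch: struct mm_config and have_gigantic_folios() are hypothetical names, and the booleans stand in for the Kconfig symbols quoted in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Kconfig symbols used in the patch. */
struct mm_config {
	bool hugetlb_page;           /* HUGETLB_PAGE */
	bool arch_has_gigantic_page; /* ARCH_HAS_GIGANTIC_PAGE */
	bool zone_device;            /* ZONE_DEVICE */
	bool have_arch_thp_pud;      /* HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
};

/*
 * Mirrors the def_bool expression: gigantic folios with hugetlb,
 * or PUD-sized folios with ZONE_DEVICE (DAX).
 */
static bool have_gigantic_folios(const struct mm_config *c)
{
	assert(c != NULL);
	return (c->hugetlb_page && c->arch_has_gigantic_page) ||
	       (c->zone_device && c->have_arch_thp_pud);
}
```

In this model, ARCH_HAS_GIGANTIC_PAGE on its own does not enable the option; it only counts together with HUGETLB_PAGE, which matches the commit message's "worry about ARCH_HAS_GIGANTIC_PAGE only with HUGETLB_PAGE".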
--
Cheers

David
Cheers, Lorenzo