Thread (21 messages), 5 authors, 2025-11-12

Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate

From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Date: 2025-11-12 12:54:07

Christophe Leroy [off-list ref] writes:
On 10/11/2025 at 12:27, David Hildenbrand (Red Hat) wrote:
Thanks for the review!
So I think what you want instead is:
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7b527d18aa5ee..1f5a1e587740c 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -276,6 +276,7 @@ config PPC_E500
          select FSL_EMB_PERFMON
          bool
          select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
+       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
          select PPC_SMP_MUXED_IPI
          select PPC_DOORBELL
          select PPC_KUEP


       select ARCH_HAS_KCOV
       select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
       select ARCH_HAS_MEMBARRIER_CALLBACKS
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7b527d18aa5ee..4c321a8ea8965 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
   config PPC_RADIX_MMU
       bool "Radix MMU Support"
       depends on PPC_BOOK3S_64
-    select ARCH_HAS_GIGANTIC_PAGE
Should remain I think.
       default y
       help
         Enable support for the Power ISA 3.0 Radix style MMU. Currently

We also have PPC_8xx do a

     select ARCH_SUPPORTS_HUGETLBFS

And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through 
PPC_BOOK3S_64.

Are we sure they cannot end up with gigantic folios through hugetlb?
Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9 
(largest hugepage is 8M) but I do get the warning with the default value 
which is 8 (with 16k pages).

For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with 
CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the 
warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
This got me thinking. Currently we can also hit this warning on
book3s64 when CONFIG_PPC_RADIX_MMU=n, because the max page size
with HASH can be 16G. I guess this was not caught in regular CI
because it requires disabling the RADIX config at build time.

On Hash we end up in this path with MAX_PAGE_ORDER equal to
CONFIG_ARCH_FORCE_MAX_ORDER, which is 8, because
ARCH_HAS_GIGANTIC_PAGE=n when only HASH is enabled.

From the snippet below, MAX_FOLIO_ORDER on !PPC_RADIX_MMU (HASH) becomes 8:

    #if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
    /*
    * We don't expect any folios that exceed buddy sizes (and consequently
    * memory sections).
    */
    #define MAX_FOLIO_ORDER		MAX_PAGE_ORDER

And thus we get a similar warning, because (order=18 for 16G) > MAX_FOLIO_ORDER (8)
in hugetlb_add_hstate().

[    0.000000] Kernel command line: console=hvc0 console=hvc1 systemd.unit=emergency.target root=/dev/vda1 noreboot disable_radix=1 hugepagesz=16M hugepages=1 hugepagesz=16G hugepages=1 default_hugepagesz=16G
<...>
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at mm/hugetlb.c:4753 hugetlb_add_hstate+0xf4/0x228
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.18.0-rc3-00138-g1e87cdb8702c #26 NONE
[    0.000000] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV
[    0.000000] NIP:  c00000000204ef4c LR: c00000000204f1b0 CTR: c00000000204ee68
[    0.000000] REGS: c000000002857ad0 TRAP: 0700   Not tainted  (6.18.0-rc3-00138-g1e87cdb8702c)
[    0.000000] MSR:  9000000002021033 <SF,HV,VEC,ME,IR,DR,RI,LE>  CR: 28000448  XER: 00000000
[    0.000000] CFAR: c00000000204eed8 IRQMASK: 3
<...>
[    0.000000] NIP [c00000000204ef4c] hugetlb_add_hstate+0xf4/0x228
[    0.000000] LR [c00000000204f1b0] hugepagesz_setup+0x130/0x16c
[    0.000000] Call Trace:
[    0.000000] [c000000002857d70] [c0000000020ee564] hstate_cmdline_buf+0x4/0x800 (unreliable)
[    0.000000] [c000000002857e10] [c00000000204f1b0] hugepagesz_setup+0x130/0x16c
[    0.000000] [c000000002857e80] [c0000000020505a8] hugetlb_bootmem_alloc+0xd8/0x1d0
[    0.000000] [c000000002857ec0] [c000000002046828] mm_core_init+0x2c/0x254
[    0.000000] [c000000002857f30] [c0000000020012ac] start_kernel+0x404/0xae0
[    0.000000] [c000000002857fe0] [c00000000000e934] start_here_common+0x1c/0x20
<...>
[    2.557050] HugeTLB: allocation took 7ms with hugepage_allocation_threads=1
[    2.562263] ------------[ cut here ]------------
[    2.564482] WARNING: CPU: 0 PID: 1 at mm/internal.h:758 gather_bootmem_prealloc_parallel+0x454/0x4d8
[    2.568266] Modules linked in:
[    2.570204] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W           6.18.0-rc3-00138-g1e87cdb8702c #26 NONE
[    2.574570] Tainted: [W]=WARN
[    2.576009] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV
[    2.579979] NIP:  c00000000204f9b0 LR: c00000000204f870 CTR: c00000000204f55c
[    2.582763] REGS: c000000004a0f5a0 TRAP: 0700   Tainted: G        W            (6.18.0-rc3-00138-g1e87cdb8702c)
[    2.586670] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 44002288  XER: 20040000
[    2.590234] CFAR: c00000000204f880 IRQMASK: 0
<...>
[    2.616926] NIP [c00000000204f9b0] gather_bootmem_prealloc_parallel+0x454/0x4d8
[    2.619928] LR [c00000000204f870] gather_bootmem_prealloc_parallel+0x314/0x4d8
[    2.622799] Call Trace:
[    2.624068] [c000000004a0f840] [c00000000204f85c] gather_bootmem_prealloc_parallel+0x300/0x4d8 (unreliable)
[    2.627847] [c000000004a0f930] [c000000002041018] padata_do_multithreaded+0x470/0x518
[    2.631141] [c000000004a0fad0] [c00000000204fce8] hugetlb_init+0x2b4/0x904
[    2.633914] [c000000004a0fc10] [c000000000010d74] do_one_initcall+0xac/0x438
[    2.636761] [c000000004a0fcf0] [c000000002001dfc] kernel_init_freeable+0x3cc/0x720
[    2.639764] [c000000004a0fde0] [c000000000011344] kernel_init+0x34/0x260
[    2.642688] [c000000004a0fe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    2.646020] ---- interrupt: 0 at 0x0
[    2.647943] Code: eba100d8 ebc100e0 ebe100e8 e9410058 e92d0c70 7d4a4a79 39200000 40820044 382100f0 eaa1ffa8 4e800020 60420000 <0fe00000> 4bfffed0 3ba00000 7ee4bb78
[    2.654240] irq event stamp: 50400
[    2.655991] hardirqs last  enabled at (50399): [<c00000000002ed84>] interrupt_exit_kernel_prepare+0xd8/0x224
[    2.659759] hardirqs last disabled at (50400): [<c00000000002bdb8>] program_check_exception+0x60/0x78
[    2.663293] softirqs last  enabled at (50320): [<c00000000017aa0c>] handle_softirqs+0x5a8/0x5c0
[    2.666819] softirqs last disabled at (50315): [<c0000000000165e4>] do_softirq_own_stack+0x40/0x54
[    2.670569] ---[ end trace 0000000000000000 ]---
[    2.697258] HugeTLB: registered 16.0 MiB page size, pre-allocated 1 pages
[    2.700831] HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page
[    2.703917] HugeTLB: registered 16.0 GiB page size, pre-allocated 1 pages
[    2.707073] HugeTLB: 0 KiB vmemmap can be freed for a 16.0 GiB page


So I guess making PPC select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS is set
should resolve this warning w.r.t. order.
And I guess the runtime allocation of gigantic pages is in any case controlled
via __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.

Feel free to correct me here if I missed anything. There seems to be a
lot of history related to hugetlb / gigantic pages.

-ritesh