Thread: 30 messages, 2 authors, 2024-07-26

Re: [PATCH v2 00/25] mm: introduce numa_memblks

From: Mike Rapoport <rppt@kernel.org>
Date: 2024-07-26 09:40:30
Also in: linux-acpi, linux-arch, linux-arm-kernel, linux-cxl, linux-devicetree, linux-doc, linux-mips, linux-mm, linux-riscv, linux-s390, linux-sh, lkml, loongarch, nvdimm, sparclinux

On Wed, Jul 24, 2024 at 10:48:42PM -0400, Zi Yan wrote:
On 24 Jul 2024, at 20:35, Zi Yan wrote:
On 24 Jul 2024, at 18:44, Zi Yan wrote:
Hi,

I have tested this series on both x86_64 and arm64. It works fine on x86_64.
All numa=fake= options work as they did before the series.

But I am not able to boot the kernel (no printout at all) on an arm64 VM
(Mac mini M1, VMware). According to git bisect, "arch_numa: switch over to
numa_memblks" is the first patch causing the boot failure. I also see the warning:

WARNING: modpost: vmlinux: section mismatch in reference: numa_add_cpu+0x1c (section: .text) -> early_cpu_to_node (section: .init.text)

I am not sure whether it is a red herring, since changing early_cpu_to_node
to cpu_to_node in numa_add_cpu() from mm/numa_emulation.c did get rid of the
warning, but the system still failed to boot.

Please note that you need binutils 2.40 to build the arm64 kernel, since there
is a bug (https://sourceware.org/bugzilla/show_bug.cgi?id=31924) in 2.42 that also
prevents the arm64 kernel from booting.

My config is attached.

I get more info after adding earlycon to the boot options.
pgdat is NULL, which causes problems when free_area_init_node() dereferences
it at the first WARN_ON.

FYI, my build is this series on top of v6.10 instead of the base commit;
the series applies cleanly on top of v6.10.

OK, the issue is that my arm64 VM has no ACPI while the x86_64 VM does,
so on the arm64 VM numa_init(arch_acpi_numa_init) fails in arch_numa_init()
and the code falls back to numa_init(dummy_numa_init). In dummy_numa_init(),
before patch 23 "arch_numa: switch over to numa_memblks", numa_add_memblk()
from drivers/base/arch_numa.c was called on arm64, and it unconditionally
set node 0 in numa_nodes_parsed. This step is missing in the x86 version of
numa_add_memblk(), which is now used by all architectures. With the patch
below, my arm64 kernel boots in the VM.

diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index 806550239d08..354f15b8d9b7 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -279,6 +279,7 @@ static int __init dummy_numa_init(void)
                pr_err("NUMA init failed\n");
                return ret;
        }
+       node_set(0, numa_nodes_parsed);

        numa_off = true;
        return 0;
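
For readers following the explanation above, here is a minimal user-space sketch of the behavioural difference, not kernel code. It assumes, as the report describes, that the old arm64 helper marked the node as parsed while the shared x86-derived helper does not. All names below (nodemask_model_t, model_node_set(), add_memblk_marks_node(), add_memblk_plain(), model_dummy_numa_init()) are hypothetical stand-ins chosen for illustration.

```c
/* User-space model only: names echo the kernel's, but none of this is kernel code. */
#include <assert.h>   /* for the checks described in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per NUMA node. */
typedef uint64_t nodemask_model_t;

static inline void model_node_set(int nid, nodemask_model_t *mask)
{
	*mask |= UINT64_C(1) << nid;
}

static inline bool model_node_isset(int nid, nodemask_model_t mask)
{
	return (mask >> nid) & 1;
}

/* Models the old arm64 helper as described in the report:
 * adding a memblk also marks its node as parsed. */
static int add_memblk_marks_node(int nid, nodemask_model_t *parsed)
{
	model_node_set(nid, parsed);
	return 0;
}

/* Models the shared (x86-derived) helper as described in the report:
 * the memblk is recorded elsewhere, the parsed mask is left untouched. */
static int add_memblk_plain(int nid, nodemask_model_t *parsed)
{
	(void)nid;
	(void)parsed;
	return 0;
}

/* Models the dummy_numa_init() fallback and returns the resulting parsed mask.
 * apply_fix corresponds to the added node_set(0, numa_nodes_parsed) line. */
static nodemask_model_t model_dummy_numa_init(bool use_shared_helper, bool apply_fix)
{
	nodemask_model_t parsed = 0;
	int ret = use_shared_helper ? add_memblk_plain(0, &parsed)
				    : add_memblk_marks_node(0, &parsed);

	if (ret)
		return parsed;
	if (apply_fix)
		model_node_set(0, &parsed);
	return parsed;
}
```

In this model, model_dummy_numa_init(false, false) and model_dummy_numa_init(true, true) both leave node 0 set, while model_dummy_numa_init(true, false) returns an empty mask, which corresponds to the configuration that failed to boot in the report.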

Feel free to add

Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64

after you incorporate the fix.

Thanks a lot for testing, debugging and fixing!

--
Best Regards,
Yan, Zi


-- 
Sincerely yours,
Mike.