Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
From: Nishanth Aravamudan <hidden>
Date: 2014-02-21 23:56:30
Also in:
linux-mm
On 21.02.2014 [14:42:03 -0800], Andrew Morton wrote:
On Thu, 20 Feb 2014 10:28:47 -0800 Nishanth Aravamudan [off-list ref] wrote:quoted
On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote:quoted
On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:quoted
We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone->node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do.Really? Isnt there some way to avoid this call if zonelists are not setup yet?How do I best determine if zonelists aren't setup yet? The call-path in question (after my series is applied) is: arch/powerpc/kernel/setup_64.c::setup_arch -> arch/powerpc/mm/numa.c::do_init_bootmem() -> cpu_numa_callback() -> numa_setup_cpu() -> map_cpu_to_node() -> update_numa_cpu_node() -> set_cpu_numa_mem() and setup_arch() is called before build_all_zonelists(NULL, NULL) in start_kernel(). This seemed like the most reasonable path, as it's used on hotplug as well.But the call to local_memory_node() you added was in start_secondary(), which isn't in that trace.
I added two calls to local_memory_node(), I *think* both are necessary, but am willing to be corrected. One is in map_cpu_to_node() and one is in start_secondary(). The start_secondary() path is fine, AFAICT, as we are up & running at that point. But in [the renamed function] update_numa_cpu_node() which is used by hotplug, we get called from do_init_bootmem(), which is before the zonelists are setup. I think both calls are necessary because I believe the arch_update_cpu_topology() is used for supporting firmware-driven home-noding, which does not invoke start_secondary() again (the processor is already running, we're just updating the topology in that situation). Then again, I could special-case the do_init_bootmem callpath, which is only called at kernel init time?
I do agree that calling local_memory_node() too early then trying to fudge around the consequences seems rather wrong.
If the answer is to simply not call local_memory_node() early, I'll submit a patch to at least add a comment, as there's nothing in the code itself to prevent this from happening and is guaranteed to oops. Thanks, Nish