microblaze HAVE_MEMBLOCK_NODE_MAP dependency (was Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA)
From: Michal Hocko <mhocko@kernel.org>
Date: 2019-07-31 13:00:45
Also in:
linux-arm-kernel, linux-mm, linux-s390, lkml, sparclinux
Subsystem:
microblaze architecture, the rest · Maintainers:
Michal Simek, Linus Torvalds
On Wed 31-07-19 15:26:32, Mike Rapoport wrote:
On Wed, Jul 31, 2019 at 01:40:16PM +0200, Michal Hocko wrote:quoted
On Wed 31-07-19 14:14:22, Mike Rapoport wrote:quoted
On Wed, Jul 31, 2019 at 10:03:09AM +0200, Michal Hocko wrote:quoted
On Wed 31-07-19 09:24:21, Mike Rapoport wrote:quoted
[ sorry for a late reply too, somehow I missed this thread before ] On Tue, Jul 30, 2019 at 10:14:15AM +0200, Michal Hocko wrote:quoted
[Sorry for a late reply] On Mon 15-07-19 17:55:07, Hoan Tran OS wrote:quoted
Hi, On 7/12/19 10:00 PM, Michal Hocko wrote:[...]quoted
quoted
Hmm, I thought this was selectable. But I am obviously wrong here. Looking more closely, it seems that this is indeed only about __early_pfn_to_nid and as such not something that should add a config symbol. This should have been called out in the changelog though.Yes, do you have any other comments about my patch?Not really. Just make sure to explicitly state that CONFIG_NODES_SPAN_OTHER_NODES is only about __early_pfn_to_nid and that doesn't really deserve it's own config and can be pulled under NUMA.quoted
quoted
Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar bucket? Do we have any NUMA architecture that doesn't enable it?HAVE_MEMBLOCK_NODE_MAP makes huge difference in node/zone initialization sequence so it's not only about a singe function.The question is whether we want to have this a config option or enable it unconditionally for each NUMA system.We can make it 'default NUMA', but we can't drop it completely because microblaze uses sparse_memory_present_with_active_regions() which is unavailable when HAVE_MEMBLOCK_NODE_MAP=n.I suppose you mean that microblaze is using sparse_memory_present_with_active_regions even without CONFIG_NUMA, right?Yes.quoted
I have to confess I do not understand that code. What is the deal with setting node id there?The sparse_memory_present_with_active_regions() iterates over memblock.memory regions and uses the node id of each region as the parameter to memory_present(). The assumption here is that sometime before each region was assigned a proper non-negative node id. microblaze uses device tree for memory enumeration and the current FDT code does memblock_add() that implicitly sets nid in memblock.memory regions to -1. So in order to have proper node id passed to memory_present() microblaze has to call memblock_set_node() before it can use sparse_memory_present_with_active_regions().
I am sorry, but I still do not follow. Who is consuming that node id information when NUMA=n. In other words why cannot we simply do
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index a015a951c8b7..3a47e8db8d1c 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c@@ -175,14 +175,9 @@ void __init setup_memory(void) start_pfn = memblock_region_memory_base_pfn(reg); end_pfn = memblock_region_memory_end_pfn(reg); - memblock_set_node(start_pfn << PAGE_SHIFT, - (end_pfn - start_pfn) << PAGE_SHIFT, - &memblock.memory, 0); + memory_present(0, start_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT); } - /* XXX need to clip this if using highmem? */ - sparse_memory_present_with_active_regions(0); - paging_init(); }
--
Michal Hocko
SUSE Labs