Thread (15 messages) 15 messages, 5 authors, 2020-03-19

Re: [PATCH v2 1/4] mm: Check for node_online in node_present_pages

From: Michal Hocko <mhocko@suse.com>
Date: 2020-03-18 11:14:47
Also in: linux-mm

On Wed 18-03-20 16:32:15, Srikar Dronamraju wrote:
* Michal Hocko [off-list ref] [2020-03-18 11:02:56]:
quoted
On Wed 18-03-20 12:58:07, Srikar Dronamraju wrote:
[...]
quoted
quoted
-#define node_present_pages(nid)	(NODE_DATA(nid)->node_present_pages)
-#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
+#define node_present_pages(nid)		\
+	(node_online(nid) ? NODE_DATA(nid)->node_present_pages : 0)
+#define node_spanned_pages(nid)		\
+	(node_online(nid) ? NODE_DATA(nid)->node_spanned_pages : 0)
I believe this is a wrong approach. We really do not want to special
case all the places which require NODE_DATA. Can we please go and
allocate pgdat for all possible nodes?
I can do that but the question I had was should we make this change just for
Powerpc or should the change be for other archs.
No, we shouldn't, really. If NODE_DATA is non-null for all possible
nodes then this shouldn't be really necessary and arch specific.
NODE_DATA initialization always seems to be in arch specific code.

The other archs that are affected seem to be mips, sh and sparc
These archs seem to have making an assumption that NODE_DATA has to be local
only,
Which is all good and fine for nodes that hold some memory. If those
architectures support memory less nodes at all then I do not see any
problem to have remote pgdata.
For example on sparc / arch/sparc/mm/init_64.c in allocate_node_data function.

  NODE_DATA(nid) = memblock_alloc_node(sizeof(struct pglist_data),
                                             SMP_CACHE_BYTES, nid);
        if (!NODE_DATA(nid)) {
                prom_printf("Cannot allocate pglist_data for nid[%d]\n", nid);
                prom_halt();
        }

        NODE_DATA(nid)->node_id = nid;
This code is not about memroy less nodes, is it? It looks more like a
allocation failure panic-like handling because there is not enough
memory to hold pgdat. This also strongly suggests that this platform
doesn't really expect memory less nodes in the early init path.
So even if I make changes to allocate NODE_DATA from fallback node, I may not
be able to test them.
Please try to focus on the architecture you can test for. From the
existing reports I have seen this looks mostly to be a problem for x86
and ppc
-- 
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help