Re: [PATCH v2 1/4] mm: Check for node_online in node_present_pages
From: Michal Hocko <mhocko@suse.com>
Date: 2020-03-18 11:14:47
Also in:
linux-mm
On Wed 18-03-20 16:32:15, Srikar Dronamraju wrote:
* Michal Hocko [off-list ref] [2020-03-18 11:02:56]:quoted
On Wed 18-03-20 12:58:07, Srikar Dronamraju wrote:
[...]
quoted
quoted
-#define node_present_pages(nid) (NODE_DATA(nid)->node_present_pages) -#define node_spanned_pages(nid) (NODE_DATA(nid)->node_spanned_pages) +#define node_present_pages(nid) \ + (node_online(nid) ? NODE_DATA(nid)->node_present_pages : 0) +#define node_spanned_pages(nid) \ + (node_online(nid) ? NODE_DATA(nid)->node_spanned_pages : 0)I believe this is a wrong approach. We really do not want to special case all the places which require NODE_DATA. Can we please go and allocate pgdat for all possible nodes?I can do that but the question I had was should we make this change just for Powerpc or should the change be for other archs.
No, we shouldn't, really. If NODE_DATA is non-null for all possible nodes then this shouldn't be really necessary and arch specific.
NODE_DATA initialization always seems to be in arch specific code. The other archs that are affected seem to be mips, sh and sparc These archs seem to have making an assumption that NODE_DATA has to be local only,
Which is all good and fine for nodes that hold some memory. If those architectures support memory less nodes at all then I do not see any problem to have remote pgdata.
For example on sparc / arch/sparc/mm/init_64.c in allocate_node_data function.
NODE_DATA(nid) = memblock_alloc_node(sizeof(struct pglist_data),
SMP_CACHE_BYTES, nid);
if (!NODE_DATA(nid)) {
prom_printf("Cannot allocate pglist_data for nid[%d]\n", nid);
prom_halt();
}
NODE_DATA(nid)->node_id = nid;This code is not about memroy less nodes, is it? It looks more like a allocation failure panic-like handling because there is not enough memory to hold pgdat. This also strongly suggests that this platform doesn't really expect memory less nodes in the early init path.
So even if I make changes to allocate NODE_DATA from fallback node, I may not be able to test them.
Please try to focus on the architecture you can test for. From the existing reports I have seen this looks mostly to be a problem for x86 and ppc -- Michal Hocko SUSE Labs