Thread (49 messages) 49 messages, 9 authors, 2020-03-18

Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9

From: Michal Hocko <mhocko@kernel.org>
Date: 2020-02-27 12:03:04
Also in: linux-next

On Wed 26-02-20 22:45:52, Vlastimil Babka wrote:
On 2/26/20 7:41 PM, Michal Hocko wrote:
quoted
On Wed 26-02-20 18:25:28, Cristopher Lameter wrote:
quoted
On Mon, 24 Feb 2020, Michal Hocko wrote:
quoted
Hmm, nasty. Is there any reason why kmalloc_node behaves differently
from the page allocator?
The page allocator will do the same thing if you pass GFP_THISNODE and
insist on allocating memory from a node that does not exist.
I do not think that the page allocator would blow up even with
GFP_THISNODE. The allocation would just fail on memory less node.

Besides that kmalloc_node shouldn't really have an implicit GFP_THISNODE
semantic right? At least I do not see anything like that documented
anywhere.
Seems like SLAB at least behaves like the page allocator. See
____cache_alloc_node() where it basically does:

page = cache_grow_begin(cachep, gfp_exact_node(flags), nodeid);
...
if (!page)
	fallback_alloc(cachep, flags)

gfp_exact_node() adds __GFP_THISNODE among other things, so the initial
attempt does try to stick only to the given node. But fallback_alloc()
doesn't. In fact, even if kmalloc_node() was called with __GFP_THISNODE
then it wouldn't work as intended, as fallback_alloc() doesn't get the
nodeid, but instead will use numa_mem_id(). That part could probably be
improved.

SLUB's ___slab_alloc() has for example this:
if (node != NUMA_NO_NODE && !node_present_pages(node))
Hmm, just a quick note. Shouldn't this be node_managed_pages? In most
cases the difference is negligible but I can imagine crazy setups where
all present pages are simply consumed.
    searchnode = node_to_mem_node(node);

That's from Joonsoo's 2014 commit a561ce00b09e ("slub: fall back to
node_to_mem_node() node if allocating on memoryless node"), suggesting
that the scenario in this bug report should work. Perhaps it just got
broken unintentionally later.
A very good reference. Thanks!
And AFAICS the whole path leading to alloc_slab_page() also doesn't add
__GFP_THISNODE, but will keep it if caller passed it, and ultimately it
does:


if (node == NUMA_NO_NODE)
    page = alloc_pages(flags, order);
else
    page = __alloc_pages_node(node, flags, order);

So yeah looks like SLUB's kmalloc_node() is supposed to behave like the
page allocator's __alloc_pages_node() and respect __GFP_THISNODE but not
enforce it by itself. There's probably just some missing data structure
initialization somewhere right now for memoryless nodes.
Thanks for the confirmation!
-- 
Michal Hocko
SUSE Labs
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help