Thread: 38 messages, 10 authors, 2015-10-09

Re: [Patch V3 0/9] Enable memoryless node support for x86

From: Jiang Liu <hidden>
Date: 2015-08-19 08:09:18
Also in: lkml

On 2015/8/18 18:02, Tang Chen wrote:
On 08/17/2015 11:18 AM, Jiang Liu wrote:
This is the third version to enable memoryless node support on x86
platforms. The previous version (https://lkml.org/lkml/2014/7/11/75)
blindly replaces numa_node_id()/cpu_to_node() with numa_mem_id()/
cpu_to_mem(). That's not the right solution as pointed out by Tejun
and Peter due to:
1) We shouldn't shift the burden to normal slab users.
2) Details of memoryless node should be hidden in arch and mm code
    as much as possible.

After digging into more code and documentation, we found the rules to
deal with memoryless node should be:
1) Arch code should online corresponding NUMA node before onlining any
    CPU or memory, otherwise it may cause invalid memory access when
    accessing NODE_DATA(nid).
2) For normal memory allocations without __GFP_THISNODE setting in the
    gfp_flags, we should prefer numa_node_id()/cpu_to_node() instead of
    numa_mem_id()/cpu_to_mem() because the latter loses hardware topology
    information as pointed out by Tejun:
       A - B - X - C - D
    Where X is the memless node.  numa_mem_id() on X would return
    either B or C, right?  If B or C can't satisfy the allocation,
    the allocator would fall back to A from B and to D from C, both of
    which aren't optimal. It should first fall back to C or B
    respectively, which the allocator can't do anymore because the
    information is lost when the caller side performs numa_mem_id().
Hi Liu,

BTW, how is this A - B - X - C - D problem solved?
I don't quite follow this.

I cannot tell the difference between numa_node_id()/cpu_to_node() and
numa_mem_id()/cpu_to_mem() on this point. Even with hardware topology
info, how could it avoid this problem?

Isn't it still possible to fall back to A from B, and to D from C?
Hi Chen,
Consider the imagined topology A<->B<->X<->C<->D, where A, B, C and D
have memory and X is memoryless.
Possible fallback lists are:
B: [ B, A, C, D]
X: [ B, C, A, D]
C: [ C, D, B, A]

cpu_to_mem(X) will return either B or C. Let's assume it returns B.
Then we will use "B: [ B, A, C, D]" to allocate memory for X, which
is not the optimal fallback list for X: if B cannot satisfy the
allocation, A (two hops from X) is tried before C (one hop from X).
cpu_to_node(X), by contrast, returns X, so the allocator uses
"X: [ B, C, A, D]", which is the optimal fallback list for X.
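
The fallback lists above can be reproduced with a small toy model. This is
only an illustrative sketch, not kernel code: it assumes the distance
between nodes is the hop count along the chain and that ties are broken by
chain order. The names fallback_list(), nearest_memory_node() and
allocation_order() are hypothetical helpers made up for this example;
nearest_memory_node() merely stands in for what cpu_to_mem() would report.

```python
# Toy model of the A<->B<->X<->C<->D chain; NOT kernel code.
# Assumption: distance is the hop count along the chain.
CHAIN = ["A", "B", "X", "C", "D"]
HAS_MEMORY = {"A", "B", "C", "D"}  # X is the memoryless node


def fallback_list(node):
    """Nodes with memory, nearest first; ties keep chain order."""
    pos = CHAIN.index(node)
    candidates = [n for n in CHAIN if n in HAS_MEMORY]
    # sorted() is stable, so equally distant nodes stay in chain order.
    return sorted(candidates, key=lambda n: abs(CHAIN.index(n) - pos))


def nearest_memory_node(node):
    """Hypothetical stand-in for cpu_to_mem() on a CPU attached to node."""
    return fallback_list(node)[0]


def allocation_order(node, use_mem_id):
    """Order in which nodes are tried for an allocation made on node."""
    # With use_mem_id the caller collapses X to B before the allocator
    # sees it, so the allocator can only use B's fallback list.
    start = nearest_memory_node(node) if use_mem_id else node
    return fallback_list(start)


print(allocation_order("X", use_mem_id=True))   # ['B', 'A', 'C', 'D']
print(allocation_order("X", use_mem_id=False))  # ['B', 'C', 'A', 'D']
```

In the first case C, which is one hop from X, is tried only after A, which
is two hops away. That is the topology information lost on the caller side.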
Thanks!
Gerry
