Re: [Patch V3 0/9] Enable memoryless node support for x86


From: Jiang Liu <hidden>
Date: 2015-08-19 08:09:18
Also in: lkml

On 2015/8/18 18:02, Tang Chen wrote:

On 08/17/2015 11:18 AM, Jiang Liu wrote:

This is the third version to enable memoryless node support on x86
platforms. The previous version (https://lkml.org/lkml/2014/7/11/75)
blindly replaced numa_node_id()/cpu_to_node() with numa_mem_id()/
cpu_to_mem(). As Tejun and Peter pointed out, that is not the right
solution because:
1) We shouldn't shift the burden to normal slab users.
2) Details of memoryless nodes should be hidden in arch and mm code
    as much as possible.

After digging into more code and documentation, we found that the rules
for dealing with memoryless nodes should be:
1) Arch code should online the corresponding NUMA node before onlining any
    CPU or memory; otherwise accessing NODE_DATA(nid) may cause an invalid
    memory access.
2) For normal memory allocations without __GFP_THISNODE set in the
    gfp_flags, we should prefer numa_node_id()/cpu_to_node() over
    numa_mem_id()/cpu_to_mem() because the latter loses hardware topology
    information, as pointed out by Tejun:
       A - B - X - C - D
    Where X is the memless node.  numa_mem_id() on X would return
    either B or C, right?  If B or C can't satisfy the allocation,
    the allocator would fall back to A from B and to D from C, both of
    which aren't optimal. It should first fall back to C or B
    respectively, which the allocator can't do anymore because the
    information is lost when the caller side performs numa_mem_id().

Hi Liu,

BTW, how is this A - B - X - C - D problem solved?
I don't quite follow this.

I cannot tell the difference between numa_node_id()/cpu_to_node() and
numa_mem_id()/cpu_to_mem() on this point. Even with hardware topology
info, how could it avoid this problem?

Isn't it still possible to fall back to A from B and to D from C?

Hi Chen,
Consider the imagined topology A<->B<->X<->C<->D, where A, B, C and D
have memory and X is memoryless.
Possible fallback lists are:
B: [ B, A, C, D]
X: [ B, C, A, D]
C: [ C, D, B, A]

cpu_to_mem(X) will return either B or C. Let's assume it returns B.
Then we will use "B: [ B, A, C, D]" to allocate memory for X, which
is not the optimal fallback list for X. cpu_to_node(X), by contrast,
returns X, and "X: [ B, C, A, D]" is the optimal fallback list for X.
Thanks!
Gerry
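
The fallback lists in the reply above can be reproduced with a small toy model. This is only a sketch, not kernel code: it assumes the five nodes sit on a linear chain, that distance is the hop count along that chain, and that ties are broken by chain order. The Python functions `cpu_to_node()` and `cpu_to_mem()` here are hypothetical stand-ins that take a node name rather than a CPU number, and `cpu_to_mem()` is assumed to pick B for X, as in the message.

```python
# Toy model of the A - B - X - C - D example; not kernel code.
NODES = ["A", "B", "X", "C", "D"]  # linear chain, in physical order
HAS_MEMORY = {"A": True, "B": True, "X": False, "C": True, "D": True}


def distance(a, b):
    """Hop count between two nodes on the chain (assumed metric)."""
    return abs(NODES.index(a) - NODES.index(b))


def fallback_list(node):
    """Nodes that have memory, nearest first; ties keep chain order."""
    candidates = [n for n in NODES if HAS_MEMORY[n]]
    # sorted() is stable, so equally distant nodes stay in chain order.
    return sorted(candidates, key=lambda n: distance(node, n))


def cpu_to_node(node):
    """Stand-in for cpu_to_node(): the node the CPU actually sits on."""
    return node


def cpu_to_mem(node):
    """Stand-in for cpu_to_mem(): the nearest node that has memory."""
    return fallback_list(node)[0]


if __name__ == "__main__":
    print("via cpu_to_node(X):", fallback_list(cpu_to_node("X")))
    print("via cpu_to_mem(X): ", fallback_list(cpu_to_mem("X")))
```

Under these assumptions, starting from `cpu_to_node("X")` yields the list B, C, A, D, while starting from `cpu_to_mem("X")` yields B's list B, A, C, D, so the second choice after B is A instead of the closer C.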

