Re: [PATCH 00/46] Automatic NUMA Balancing V4

From: Mel Gorman <mgorman@suse.de>
Date: 2012-11-21 18:02:09
Also in: lkml

On Wed, Nov 21, 2012 at 06:33:16PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman [off-list ref] wrote:
>
> > On Wed, Nov 21, 2012 at 06:03:06PM +0100, Ingo Molnar wrote:
> > >
> > > * Mel Gorman [off-list ref] wrote:
> > >
> > > > On Wed, Nov 21, 2012 at 10:21:06AM +0000, Mel Gorman wrote:
> > > > >
> > > > > I am not including a benchmark report in this but will be posting one
> > > > > shortly in the "Latest numa/core release, v16" thread along with the latest
> > > > > schednuma figures I have available.
> > > >
> > > > Report is linked here https://lkml.org/lkml/2012/11/21/202
> > > >
> > > > I ended up cancelling the remaining tests and restarted with
> > > >
> > > > 1. schednuma + patches posted since so that works out as
> > >
> > > Mel, I'd like to ask you to refer to our tree as numa/core or
> > > 'numacore' in the future. Would such a courtesy to use the
> > > current name of our tree be possible?
> >
> > Sure, no problem.
>
> Thanks!
>
> I ran a quick test with your 'balancenuma v4' tree and while
> numa02 and numa01-THREAD-ALLOC performance is looking good,
> numa01 performance does not look very good:
>
>                     mainline    numa/core      balancenuma-v4
>      numa01:           340.3       139.4          276 secs
>
> 97% slower than numa/core.

It would be. numa01 is an adverse workload where all threads are hammering
the same memory.  The two-stage filter in balancenuma restricts the amount
of migration it does so it ends up in a situation where it cannot balance
properly. It'll do some migration if the PTE updates happen fast enough but
that's about it.  It needs a proper policy on top to detect this situation
and interleave the memory between nodes to at least maximise the available
memory bandwidth. This would replace the two-stage filter which is there
to mitigate a ping-pong effect.
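
To illustrate why a filter of this kind stalls on numa01, here is a minimal Python sketch. It is not the kernel code. It assumes the filter remembers the node of the most recent hinting fault on each page (modelled on the last_nid idea from patch 35) and migrates only when two consecutive faults come from the same remote node. The names Page, hinting_fault, count_migrations and count_unfiltered are hypothetical and exist only for this example.

```python
class Page:
    """Toy page: the node it lives on and the node of the last hinting fault."""

    def __init__(self, node):
        self.node = node
        self.last_nid = None  # hypothetical stand-in for a per-page last_nid


def hinting_fault(page, faulting_node):
    """Return True if this fault migrates the page (assumed filter behaviour).

    Stage 1: record the faulting node.
    Stage 2: migrate only if the previous recorded fault was from the same node.
    """
    previous = page.last_nid
    page.last_nid = faulting_node
    if faulting_node == page.node:
        return False  # page is already local to the faulting node
    if previous != faulting_node:
        return False  # first fault seen from this node: record it, do not move
    page.node = faulting_node
    return True


def count_migrations(fault_nodes, start_node=0):
    """Number of migrations the filtered policy performs for a fault sequence."""
    page = Page(start_node)
    return sum(hinting_fault(page, nid) for nid in fault_nodes)


def count_unfiltered(fault_nodes, start_node=0):
    """Number of migrations if every remote fault moved the page (ping-pong)."""
    node, moves = start_node, 0
    for nid in fault_nodes:
        if nid != node:
            node, moves = nid, moves + 1
    return moves


# Private page: a single task on node 1 keeps touching a page that starts on node 0.
private = count_migrations([1, 1, 1, 1, 1, 1])

# Shared page, numa01-style: threads on node 1 and node 0 alternate.
shared_pattern = [1, 0, 1, 0, 1, 0]
shared = count_migrations(shared_pattern)
pingpong = count_unfiltered(shared_pattern)

print(private, shared, pingpong)  # prints "1 0 6"
```

Under these assumptions the private page moves once and then stays put, the shared page never moves, and the unfiltered policy bounces the shared page on every fault. That matches the trade-off described above: the filter suppresses ping-pong but cannot by itself spread a heavily shared working set across nodes.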

> I did a quick SPECjbb 32-warehouses run as well:
>
>                                 numa/core      balancenuma-v4
>       SPECjbb  +THP:               655 k/sec      607 k/sec

Cool. Let's see what we have here. I have some questions:

You say you ran with 32 warehouses. Was this a single run with just 32
warehouses, or did you do a SPECjbb run up to 32 warehouses and use the
figure SPECjbb spits out? If it ran for multiple warehouses, how did each
number of warehouses do? I ask because sometimes we do worse for low numbers
of warehouses and better at high numbers, particularly around where the
workload peaks.

Was this a single JVM configuration?

What is the comparison with a baseline kernel?

You say you ran with balancenuma-v4. Was that the full series including
the broken placement policy or did you test with just patches 1-37 as I
asked in the patch leader?

> Here it's 7.9% slower.

And in comparison to a vanilla kernel?

Bear in mind that my objective was to have a foundation that did noticeably
better than mainline that a proper placement and scheduling policy could
be built on top of.
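
For reference, the relative differences between the quoted figures can be recomputed with a short Python sketch. It uses only the numbers quoted in this mail; pct_change and the dictionary keys are names chosen for the example. No mainline SPECjbb figure was given, so that comparison cannot be computed here.

```python
def pct_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


# numa01 elapsed time in seconds (lower is better), as quoted above.
numa01 = {"mainline": 340.3, "numa/core": 139.4, "balancenuma-v4": 276.0}

# SPECjbb +THP throughput in k/sec (higher is better), as quoted above.
specjbb = {"numa/core": 655.0, "balancenuma-v4": 607.0}

# balancenuma-v4 elapsed time relative to numa/core and to mainline.
vs_numacore = pct_change(numa01["balancenuma-v4"], numa01["numa/core"])
vs_mainline = pct_change(numa01["balancenuma-v4"], numa01["mainline"])

# The SPECjbb gap, expressed against each tree in turn.
gap_vs_balancenuma = pct_change(specjbb["numa/core"], specjbb["balancenuma-v4"])
gap_vs_numacore = pct_change(specjbb["balancenuma-v4"], specjbb["numa/core"])

print(f"{vs_numacore:.1f} {vs_mainline:.1f} "
      f"{gap_vs_balancenuma:.1f} {gap_vs_numacore:.1f}")
# prints "98.0 -18.9 7.9 -7.3"
```

On these figures balancenuma-v4 takes about 98% longer than numa/core on numa01 but about 19% less time than mainline. The 7.9% SPECjbb figure corresponds to numa/core's throughput measured against balancenuma-v4; measured against numa/core, balancenuma-v4 is about 7.3% lower.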

Thanks!

-- 
Mel Gorman
SUSE Labs

