[PATCH 16/31] numa, mm: Support NUMA hinting page faults from gup/gup_fast

From: Peter Zijlstra <hidden>
Date: 2012-10-25 13:14:31
Also in: lkml

From: Ingo Molnar <mingo@kernel.org>

Introduce FOLL_NUMA to tell follow_page to check pte_numa/pmd_numa.
get_user_pages must use FOLL_NUMA, and it is safe to do so because,
when follow_page returns no page, it always invokes handle_mm_fault
and then retries follow_page.

KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.

Other follow_page callers like KSM should not use FOLL_NUMA, or they
would fail to get the pages if they use follow_page instead of
get_user_pages.
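
A toy userspace model of the behaviour described above may make the control flow easier to follow. It is only a sketch, not kernel code: `toy_pte`, `toy_follow_page()`, `toy_handle_mm_fault()` and `toy_get_page()` are hypothetical stand-ins invented for illustration, and only the `FOLL_NUMA` value is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define FOLL_NUMA 0x200 /* value from the patch */

/* Hypothetical stand-in for a present pte; not the kernel's pte_t. */
struct toy_pte {
	bool numa; /* marked so that the next access takes a hinting fault */
	int page;  /* stand-in for struct page *; must be non-zero here */
};

/* Models follow_page(): with FOLL_NUMA, a NUMA-marked pte yields no page. */
static int toy_follow_page(const struct toy_pte *pte, unsigned int flags)
{
	if ((flags & FOLL_NUMA) && pte->numa)
		return 0;
	return pte->page;
}

/* Models the hinting-fault side of handle_mm_fault(): clear the marker. */
static void toy_handle_mm_fault(struct toy_pte *pte, int *hinting_faults)
{
	if (pte->numa) {
		pte->numa = false;
		(*hinting_faults)++;
	}
}

/* Models the get_user_pages() loop: fault, then retry the lookup. */
static int toy_get_page(struct toy_pte *pte, unsigned int flags,
			int *hinting_faults)
{
	int page;
	int tries = 0;

	while ((page = toy_follow_page(pte, flags)) == 0) {
		toy_handle_mm_fault(pte, hinting_faults);
		tries++;
		assert(tries <= 1); /* one fault is enough in this model */
	}
	return page;
}
```

In this model, `toy_get_page()` with `FOLL_NUMA` takes exactly one hinting fault on a marked pte and then returns the page. A bare `toy_follow_page()` call with `FOLL_NUMA` on the same pte returns nothing and has no retry path, which mirrors why direct follow_page callers such as KSM should leave the flag clear.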

[ This patch was picked up from the AutoNUMA tree. ]

Originally-by: Andrea Arcangeli [off-list ref]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <redacted>
Cc: Andrea Arcangeli <redacted>
Cc: Rik van Riel <redacted>
[ ported to this tree. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/mm.h |    1 +
 mm/memory.c        |   17 +++++++++++++++++
 2 files changed, 18 insertions(+)

Index: tip/include/linux/mm.h
===================================================================
--- tip.orig/include/linux/mm.h
+++ tip/include/linux/mm.h
@@ -1600,6 +1600,7 @@ struct page *follow_page(struct vm_area_
 #define FOLL_MLOCK	0x40	/* mark page as mlocked */
 #define FOLL_SPLIT	0x80	/* don't return transhuge pages, split them */
 #define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
+#define FOLL_NUMA	0x200	/* force NUMA hinting page fault */
 
 typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
 			void *data);

Index: tip/mm/memory.c
===================================================================
--- tip.orig/mm/memory.c
+++ tip/mm/memory.c
@@ -1536,6 +1536,8 @@ struct page *follow_page(struct vm_area_
 		page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
 		goto out;
 	}
+	if ((flags & FOLL_NUMA) && pmd_numa(vma, *pmd))
+		goto no_page_table;
 	if (pmd_trans_huge(*pmd)) {
 		if (flags & FOLL_SPLIT) {
 			split_huge_page_pmd(mm, pmd);

@@ -1565,6 +1567,8 @@ split_fallthrough:
 	pte = *ptep;
 	if (!pte_present(pte))
 		goto no_page;
+	if ((flags & FOLL_NUMA) && pte_numa(vma, pte))
+		goto no_page;
 	if ((flags & FOLL_WRITE) && !pte_write(pte))
 		goto unlock;

@@ -1716,6 +1720,19 @@ int __get_user_pages(struct task_struct
 			(VM_WRITE | VM_MAYWRITE) : (VM_READ | VM_MAYREAD);
 	vm_flags &= (gup_flags & FOLL_FORCE) ?
 			(VM_MAYREAD | VM_MAYWRITE) : (VM_READ | VM_WRITE);
+
+	/*
+	 * If FOLL_FORCE and FOLL_NUMA are both set, handle_mm_fault
+	 * would be called on PROT_NONE ranges. We must never invoke
+	 * handle_mm_fault on PROT_NONE ranges or the NUMA hinting
+	 * page faults would unprotect the PROT_NONE ranges if
+	 * _PAGE_NUMA and _PAGE_PROTNONE are sharing the same pte/pmd
+	 * bitflag. So to avoid that, don't set FOLL_NUMA if
+	 * FOLL_FORCE is set.
+	 */
+	if (!(gup_flags & FOLL_FORCE))
+		gup_flags |= FOLL_NUMA;
+
 	i = 0;
 
 	do {
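
The flag selection added in the last hunk can be modelled in userspace as follows. This is a sketch under stated assumptions: `toy_gup_flags()` is a hypothetical helper, and the `FOLL_WRITE` and `FOLL_FORCE` values are assumed for illustration; only `FOLL_NUMA` is defined by this patch.

```c
#include <assert.h>

#define FOLL_WRITE 0x01  /* assumed value, for illustration only */
#define FOLL_FORCE 0x10  /* assumed value, for illustration only */
#define FOLL_NUMA  0x200 /* value from the patch */

/*
 * Models the check added to __get_user_pages(): request NUMA hinting
 * faults unless the caller forces access. A forced access may target a
 * PROT_NONE range, where a hinting fault could wrongly unprotect it if
 * the NUMA and PROT_NONE markers share one pte/pmd bit.
 */
static unsigned int toy_gup_flags(unsigned int gup_flags)
{
	if (!(gup_flags & FOLL_FORCE))
		gup_flags |= FOLL_NUMA;
	return gup_flags;
}
```

The design choice is conservative: a forced lookup simply forgoes NUMA hinting faults, so it never reaches the fault path on a range that is genuinely PROT_NONE.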

