Re: [PATCH 05/31] x86/mm: Reduce tlb flushes from ptep_set_access_flags()

From: Rik van Riel <hidden>
Date: 2012-10-26 17:52:42
Also in: lkml

On 10/26/2012 01:01 PM, Linus Torvalds wrote:

On Fri, Oct 26, 2012 at 5:34 AM, Michel Lespinasse [off-list ref] wrote:

On Thu, Oct 25, 2012 at 9:23 PM, Linus Torvalds [off-list ref] wrote:

Yes. It's not architected as far as I know, though. But I agree, it's
possible - even likely - we could avoid TLB flushing entirely on x86.

Actually, it is architected on x86. This was first described in the
Intel appnote 317080 "TLBs, Paging-Structure Caches, and Their
Invalidation", last paragraph of section 5.1. Nowadays, the same
contents are buried somewhere in Volume 3 of the architecture manual
(in my copy: 4.10.4.1 Operations that Invalidate TLBs and
Paging-Structure Caches).

Good. I should have known it must be architected, because we've gone
back-and-forth on this in the kernel historically. We used to have
some TLB invalidates in the faulting path because I wasn't sure
whether they were needed or not, but we clearly don't have them any
more (and I suspect coverage was always spotty).

And Intel (and AMD) have been very good at documenting as architected
these kinds of details that people end up relying on even if they
weren't necessarily originally explicitly documented.

I *suspect* that whole TLB flush just magically became an SMP one
without anybody ever really thinking about it.

I would be very worried about assuming every non-x86 arch has similar
TLB semantics. However, if their fault handlers always invalidate TLB
for pages that get spurious faults, then skipping the remote
invalidation would be fine. (I believe this is what
tlb_fix_spurious_fault() is for ?)

Yes. Of course, there may be some case where we unintentionally don't
necessarily flush a faulting address (on some architecture that needs
it), and then removing the cross-cpu invalidate could expose that
pre-existing bug-let, and cause an infinite loop of page faults due to
a TLB entry that never gets invalidated even if the page tables are
actually up-to-date.

So changing the mm/pgtable-generic.c function sounds like the right
thing to do, but would be a bit more scary.

Changing the x86 version sounds safe, *especially* since you point out
that the "fault-causes-tlb-invalidate" is architected behavior.

So I'd almost be willing to drop the invalidate in just one single
commit, because it really should be safe. The only thing it does is
guarantee that the accessed bit gets updated, and the accessed bit
just isn't that important. If we never flush the TLB on another CPU
that continues to use a TLB entry where the accessed bit is set (even
if it's cleared in the in-memory page tables), the worst that can
happen is that the accessed bit doesn't ever get set even if that CPU
constantly uses the page.
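
To make the worst case above concrete, here is a minimal user-space sketch. It is a toy model and not kernel code: every name in it (toy_pte, toy_touch_page(), toy_clear_accessed(), toy_accessed_after_reuse()) is made up for illustration, and it assumes the simplified rule that a CPU only writes the accessed bit back to memory on a TLB fill.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one in-memory PTE and one cached copy per CPU. */
enum { TOY_NR_CPUS = 2 };

struct toy_pte { bool present; bool accessed; };
struct toy_tlb { bool valid; struct toy_pte entry; };

static struct toy_pte toy_mem_pte = { .present = true, .accessed = false };
static struct toy_tlb toy_tlbs[TOY_NR_CPUS];

/* Simulated memory access: only a TLB miss walks the page table and sets A. */
static void toy_touch_page(int cpu)
{
    if (!toy_tlbs[cpu].valid) {
        toy_mem_pte.accessed = true;          /* walker sets A on fill */
        toy_tlbs[cpu].entry = toy_mem_pte;
        toy_tlbs[cpu].valid = true;
    }
    /* TLB hit with A already cached: nothing is written to memory. */
}

/* Clear A in memory; optionally invalidate every CPU's cached entry. */
static void toy_clear_accessed(bool flush_remote)
{
    toy_mem_pte.accessed = false;
    if (flush_remote)
        for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
            toy_tlbs[cpu].valid = false;
}

/* Returns the in-memory A bit after CPU 1 keeps using the page. */
static bool toy_accessed_after_reuse(bool flush_remote)
{
    toy_mem_pte.accessed = false;
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_tlbs[cpu].valid = false;

    toy_touch_page(1);                 /* CPU 1 caches the entry with A set */
    toy_clear_accessed(flush_remote);  /* A cleared in the page table */
    for (int i = 0; i < 1000; i++)
        toy_touch_page(1);             /* constant use of the page */
    return toy_mem_pte.accessed;
}
```

Without the remote flush, toy_accessed_after_reuse(false) leaves the in-memory accessed bit clear however often CPU 1 touches the page. Nothing faults and nothing loops; the only cost is a stale accessed bit.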

I suspect it would be safe to simply call tlb_fix_spurious_fault()
both on x86 and in the generic version.

If tlb_fix_spurious_fault is broken on some architecture, they
would already be running into issues like "write page fault
loops until the next context switch" :)
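
A rough user-space sketch of that argument, again as a toy model and not kernel code. The names (spur_pte, spur_write(), spur_handle_fault(), spur_faults_until_progress()) are hypothetical, and the invalidate_on_fault flag stands in for whatever invalidates the faulting entry: the architected behavior on x86, or tlb_fix_spurious_fault() on an architecture that needs software to do it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct spur_pte { bool writable; };
struct spur_tlb { bool valid; struct spur_pte entry; };

static struct spur_pte spur_mem_pte;    /* the in-memory page table entry */
static struct spur_tlb spur_remote_tlb; /* TLB of a CPU that got no remote flush */

/* Attempt a write through the TLB: true on success, false on a fault. */
static bool spur_write(struct spur_tlb *t)
{
    if (!t->valid) {                    /* miss: refill from the page table */
        t->entry = spur_mem_pte;
        t->valid = true;
    }
    return t->entry.writable;
}

/*
 * The page table already permits the write, so the fault is spurious.
 * The only useful thing the handler can do is drop the stale local entry.
 */
static void spur_handle_fault(struct spur_tlb *t, bool invalidate_on_fault)
{
    if (invalidate_on_fault)
        t->valid = false;
}

/* Number of faults before the write succeeds, or -1 if it never does. */
static int spur_faults_until_progress(bool invalidate_on_fault, int max_tries)
{
    spur_mem_pte.writable = false;
    spur_remote_tlb.valid = false;
    (void)spur_write(&spur_remote_tlb); /* remote CPU caches a read-only entry */

    spur_mem_pte.writable = true;       /* PTE upgraded, no cross-CPU flush */

    for (int faults = 0; faults < max_tries; faults++) {
        if (spur_write(&spur_remote_tlb))
            return faults;
        spur_handle_fault(&spur_remote_tlb, invalidate_on_fault);
    }
    return -1;
}
```

With the local invalidate in place, the remote CPU takes exactly one spurious fault and then makes progress. Without it, the model never gets past the stale entry, which is the fault loop an architecture with a broken tlb_fix_spurious_fault() would already be hitting today.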

Again, this can be different on non-x86 architectures with software
dirty bits, where a stale TLB entry that never gets flushed could
cause infinite TLB faults that never make progress, but that's really
a TLB _walker_ issue, not a generic VM issue.

Would tlb_fix_spurious_fault take care of that on those
architectures?
