Re: [PATCH v9 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
From: Barry Song <hidden>
Date: 2022-03-16 22:15:41
Also in:
linux-doc, linux-mm, lkml
On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao [off-list ref] wrote:
Some architectures support the accessed bit in non-leaf PMD entries,
e.g., x86 sets the accessed bit in a non-leaf PMD entry when using it
as part of linear address translation [1]. Page table walkers that
clear the accessed bit may use this capability to reduce their search
space.
Note that:
1. Although an inline function is preferable, this capability is added
as a configuration option for consistency with the existing macros.
2. Due to the little interest in other varieties, this capability was
only tested on Intel and AMD CPUs.
[1]: Intel 64 and IA-32 Architectures Software Developer's Manual
Volume 3 (June 2021), section 4.8
Signed-off-by: Yu Zhao <redacted>
Acked-by: Brian Geffon <redacted>
Acked-by: Jan Alexander Steffens (heftig) <redacted>
Acked-by: Oleksandr Natalenko <redacted>
Acked-by: Steven Barrett <redacted>
Acked-by: Suleiman Souhlal <redacted>
Tested-by: Daniel Byrne <redacted>
Tested-by: Donald Carr <redacted>
Tested-by: Holger Hoffstätte <redacted>
Tested-by: Konstantin Kharlamov <redacted>
Tested-by: Shuang Zhai <redacted>
Tested-by: Sofia Trinh <redacted>
Tested-by: Vaibhav Jain <redacted>
---
Reviewed-by: Barry Song <baohua@kernel.org>
hard to read this patch by itself. but after reading the change in
walk_pmd_range(), it seems this patch becomes quite clear:
walk_pmd_range()
{
...
#ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
if (get_cap(LRU_GEN_NONLEAF_YOUNG)) {
if (!pmd_young(val))
continue;
walk_pmd_range_locked(pud, addr, vma, walk, &pos);
}
#endif
...
}
this gives us the chance to skip the scan of all ptes within the
pmd.
so i am not quite sure this should necessarily be a separate
patch, or should be put together with the change in
walk_pmd_range() to make readers understand its purpose.
quoted hunk ↗ jump to hunk
arch/Kconfig | 9 +++++++++ arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 3 ++- arch/x86/mm/pgtable.c | 5 ++++- include/linux/pgtable.h | 4 ++-- 5 files changed, 18 insertions(+), 4 deletions(-)diff --git a/arch/Kconfig b/arch/Kconfig index 678a80713b21..f9c59ecadbbb 100644 --- a/arch/Kconfig +++ b/arch/Kconfig@@ -1322,6 +1322,15 @@ config DYNAMIC_SIGFRAME config HAVE_ARCH_NODE_DEV_GROUP bool +config ARCH_HAS_NONLEAF_PMD_YOUNG + bool + depends on PGTABLE_LEVELS > 2 + help + Architectures that select this option are capable of setting the + accessed bit in non-leaf PMD entries when using them as part of linear + address translations. Page table walkers that clear the accessed bit + may use this capability to reduce their search space. + source "kernel/gcov/Kconfig" source "scripts/gcc-plugins/Kconfig"diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9f5bd41bf660..e787b7fc75be 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig@@ -85,6 +85,7 @@ config X86 select ARCH_HAS_PMEM_API if X86_64 select ARCH_HAS_PTE_DEVMAP if X86_64 select ARCH_HAS_PTE_SPECIAL + select ARCH_HAS_NONLEAF_PMD_YOUNG select ARCH_HAS_UACCESS_FLUSHCACHE if X86_64 select ARCH_HAS_COPY_MC if X86_64 select ARCH_HAS_SET_MEMORYdiff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 60b6ce45c2e3..f973788f6b21 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h@@ -819,7 +819,8 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) static inline int pmd_bad(pmd_t pmd) { - return (pmd_flags(pmd) & ~_PAGE_USER) != _KERNPG_TABLE; + return (pmd_flags(pmd) & ~(_PAGE_USER | _PAGE_ACCESSED)) != + (_KERNPG_TABLE & ~_PAGE_ACCESSED); } static inline unsigned long pages_to_mb(unsigned long npg)diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 3481b35cb4ec..a224193d84bf 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c@@ -550,7 +550,7 @@ int ptep_test_and_clear_young(struct vm_area_struct *vma, return ret; } -#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG) int pmdp_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmdp) {@@ -562,6 +562,9 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma, return ret; } +#endif + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE int pudp_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pud_t *pudp) {diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 79f64dcff07d..743e7fc4afda 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h@@ -212,7 +212,7 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG -#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG) static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp)@@ -233,7 +233,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma, BUILD_BUG(); return 0; } -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */ #endif #ifndef __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH --2.35.1.616.g0bdcbb4464-goog
Thanks Barry _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel