Re: [External] [PATCH RFC/RFT v2 4/4] riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
From: yunhui cui <hidden>
Date: 2024-05-30 09:35:28
Also in:
linux-arm-kernel, linux-mips, linux-mm, linux-riscv, lkml
Hi Alex, On Thu, Feb 1, 2024 at 12:04 AM Alexandre Ghiti [off-list ref] wrote:
quoted hunk ↗ jump to hunk
The preventive sfence.vma were emitted because new mappings must be made visible to the page table walker but Svvptc guarantees that xRET act as a fence, so no need to sfence.vma for the uarchs that implement this extension. This allows to drastically reduce the number of sfence.vma emitted: * Ubuntu boot to login: Before: ~630k sfence.vma After: ~200k sfence.vma * ltp - mmapstress01 Before: ~45k After: ~6.3k * lmbench - lat_pagefault Before: ~665k After: 832 (!) * lmbench - lat_mmap Before: ~546k After: 718 (!) Signed-off-by: Alexandre Ghiti <redacted> --- arch/riscv/include/asm/pgtable.h | 16 +++++++++++++++- arch/riscv/mm/pgtable.c | 13 +++++++++++++ 2 files changed, 28 insertions(+), 1 deletion(-)diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 0c94260b5d0c..50986e4c4601 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h@@ -473,6 +473,9 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long address, pte_t *ptep, unsigned int nr) { + asm_volatile_goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1) + : : : : svvptc); + /* * The kernel assumes that TLBs don't cache invalid entries, but * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a@@ -482,12 +485,23 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf, */ while (nr--) local_flush_tlb_page(address + nr * PAGE_SIZE); + +svvptc: + /* + * Svvptc guarantees that xRET act as a fence, so when the uarch does + * not cache invalid entries, we don't have to do anything. + */ + ; }
From the perspective of RISC-V arch, the logic of this patch is reasonable. The code of mm comm submodule may be missing update_mmu_cache_range(), for example: there is no flush TLB in remap_pte_range() after updating pte. I will send a patch to mm/ to fix this problem next. Thanks, Yunhui