Re: [PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range()
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
Date: 2021-01-16 17:23:37
Hi Mark,

On 1/15/21 3:45 PM, Mark Rutland wrote:
> On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
>> mte_assign_mem_tag_range() is called on production KASAN HW hot
>> paths. It makes sense to optimize it in an attempt to reduce the
>> overhead.
>>
>> Optimize mte_assign_mem_tag_range() based on the indications provided at
>> [1].
>
> ... what exactly is the optimization?
>
> I /think/ you're just trying to have it inlined, but you should mention
> that explicitly.
>
Good point, I will change it in the next version. I used "Optimize" as a
continuation of the topic in the previous thread, but you are right, it is not
immediately obvious.

>> [1] https://lore.kernel.org/r/CAAeHK+wCO+J7D1_T89DG+jJrPLk3X9RsGFKxJGd0ZcUFjQT-9Q@mail.gmail.com/
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> ---
>>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>>  arch/arm64/lib/mte.S         | 15 ---------------
>>  2 files changed, 25 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
>> index 1a715963d909..9730f2b07b79 100644
>> --- a/arch/arm64/include/asm/mte.h
>> +++ b/arch/arm64/include/asm/mte.h
>> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>>  			 unsigned long addr, unsigned long data);
>>  
>> -void mte_assign_mem_tag_range(void *addr, size_t size);
>> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
>> +{
>> +	u64 _addr = (u64)addr;
>> +	u64 _end = _addr + size;
>> +
>> +	/*
>> +	 * This function must be invoked from an MTE enabled context.
>> +	 *
>> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> +	 */
>> +	do {
>> +		/*
>> +		 * 'asm volatile' is required to prevent the compiler to move
>> +		 * the statement outside of the loop.
>> +		 */
>> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
>> +			     :
>> +			     : "r" (_addr)
>> +			     : "memory");
>> +
>> +		_addr += MTE_GRANULE_SIZE;
>> +	} while (_addr < _end);
>
> Is there any chance that this can be used for the last bytes of the
> virtual address space? This might need to change to `_addr == _end` if
> that is possible, otherwise it'll terminate early in that case.
>
Theoretically it is a possibility. I will change the condition and add a note for that.

>> +}
>
> What does the code generation look like for this, relative to the
> assembly version?
>
The assembly looks like this:

 390:	8b000022 	add	x2, x1, x0
 394:	aa0003e1 	mov	x1, x0
 398:	d9200821 	stg	x1, [x1]
 39c:	91004021 	add	x1, x1, #0x10
 3a0:	eb01005f 	cmp	x2, x1
 3a4:	54ffffa8 	b.hi	398 <mte_set_mem_tag_range+0x48>

You can see the handcrafted one below.

> Thanks,
> Mark.
>
>> +
>>  #else /* CONFIG_ARM64_MTE */
>>  
>> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
>> index 9e1a12e10053..a0a650451510 100644
>> --- a/arch/arm64/lib/mte.S
>> +++ b/arch/arm64/lib/mte.S
>> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>>  	ret
>>  SYM_FUNC_END(mte_restore_page_tags)
>>  
>> -/*
>> - * Assign allocation tags for a region of memory based on the pointer tag
>> - *   x0 - source pointer
>> - *   x1 - size
>> - *
>> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> - */
>> -SYM_FUNC_START(mte_assign_mem_tag_range)
>> -1:	stg	x0, [x0]
>> -	add	x0, x0, #MTE_GRANULE_SIZE
>> -	subs	x1, x1, #MTE_GRANULE_SIZE
>> -	b.gt	1b
>> -	ret
>> -SYM_FUNC_END(mte_assign_mem_tag_range)
>> -- 
>> 2.30.0
>>

-- 
Regards,
Vincenzo

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel