Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2020-05-28 16:35:00
Also in:
linux-arch, linux-mm
On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> The 05/28/2020 10:14, Catalin Marinas wrote:
> > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > On Fri, May 15, 2020 at 10:16 AM Catalin Marinas [off-list ref] wrote:
> > > > To enable tagging on a memory range, the user must explicitly opt in via a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new memory type in the AttrIndx field of a pte, simplify the or'ing of these bits over the protection_map[] attributes by making MT_NORMAL index 0.
> > >
> > > Should the userspace stack always be mapped as if with PROT_MTE if the hardware supports it? Such a change would be invisible to non-MTE aware userspace since it would already need to opt in to tag checking via prctl. This would let userspace avoid a complex stack initialization sequence when running with stack tagging enabled on the main thread.
> >
> > I don't think the stack initialisation is that difficult. On program startup (can be the dynamic loader). Something like (untested):
> >
> > 	register unsigned long stack asm ("sp");
> > 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> >
> > 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> >
> > (the essential part it PROT_GROWSDOWN so that you don't have to specify a stack lower limit)
>
> does this work even if the currently mapped stack is more than page_sz? determining the mapped main stack area is i think non-trivial to do in userspace (requires parsing /proc/self/maps or similar).

Because of PROT_GROWSDOWN, the kernel adjusts the start of the range down automatically. It is potentially problematic if the top of the stack is more than a page away and you want the whole stack coloured. I haven't run a test but my reading of the kernel code is that the stack vma would be split in this scenario, so the range beyond sp+page_sz won't have PROT_MTE set. My assumption is that if you do this during program start, the stack is smaller than a page. Alternatively, could we use argv or envp to determine the top of the user stack (the bottom is taken care of by the kernel)?
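
To make the argv/envp idea concrete, here is a rough, untested-on-MTE sketch of how userspace might compute the mprotect() range so that it reaches from the page containing sp up to an estimate of the stack top. The helper names (page_floor, page_ceil, stack_top_hint, stack_mprotect_range) are hypothetical, and the sketch assumes the environment strings sit close to the top of the initial stack, which is how Linux lays it out today but is not something the ABI promises. The hint is only a lower bound on the real top; rounding up to a page boundary usually covers the rest, but that is an assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round addr down to a multiple of page_sz (must be a power of two). */
static uintptr_t page_floor(uintptr_t addr, uintptr_t page_sz)
{
	assert(page_sz != 0 && (page_sz & (page_sz - 1)) == 0);
	return addr & ~(page_sz - 1);
}

/* Round addr up to a multiple of page_sz. */
static uintptr_t page_ceil(uintptr_t addr, uintptr_t page_sz)
{
	return page_floor(addr + page_sz - 1, page_sz);
}

/*
 * Hypothetical helper: highest end address (exclusive) of any envp string.
 * Assumption: these strings live near the top of the initial stack.
 * Returns 0 if envp is NULL or empty.
 */
static uintptr_t stack_top_hint(char *const *envp)
{
	uintptr_t top = 0;

	for (; envp && *envp; envp++) {
		uintptr_t end = (uintptr_t)*envp + strlen(*envp) + 1;

		if (end > top)
			top = end;
	}
	return top;
}

struct stack_range {
	uintptr_t start;	/* page-aligned start to pass to mprotect() */
	size_t len;		/* length in bytes, a multiple of page_sz */
};

/*
 * Range from the page containing sp up to the page-rounded top hint.
 * With PROT_GROWSDOWN the kernel extends the start downwards itself,
 * so only the upper end needs this treatment. Falls back to a single
 * page when no usable hint is available.
 */
static struct stack_range stack_mprotect_range(uintptr_t sp, uintptr_t top_hint,
					       uintptr_t page_sz)
{
	struct stack_range r;
	uintptr_t end;

	r.start = page_floor(sp, page_sz);
	end = page_ceil(top_hint > sp ? top_hint : sp + 1, page_sz);
	r.len = end - r.start;
	return r;
}
```

The resulting start and len would then be passed to mprotect() together with PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN, in place of the single-page range in the snippet quoted above.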
> > I'm fine, however, with enabling PROT_MTE on the main stack based on some ELF note.
>
> note that would likely mean an elf note on the dynamic linker (because a dynamic linked executable may not be loaded by the kernel and ctors in loaded libs run before the executable entry code anyway, so the executable alone cannot be in charge of this decision) i.e. one global switch for all dynamic linked binaries.

I guess parsing such note in the kernel is only useful for static binaries.

> i think a dynamic linker can map a new stack and switch to it if it needs to control the properties of the stack at runtime (it's wasteful though).

There is already user code to check for HWCAP2_MTE and the prctl(), so adding an mprotect() doesn't look like a significant overhead.
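
For illustration, the startup sequence could look roughly like the sketch below: check HWCAP2_MTE, opt in to tag checking with prctl(), then mprotect() the stack page. The fallback constant values and the prctl() flags are taken from the interface as it later landed upstream, so treat them as assumptions that may not match this version of the series. The function names mte_supported and enable_mte_on_main_stack are made up for the example, and __builtin_frame_address(0) stands in for reading sp directly.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Fallbacks for older headers; values assumed from upstream arm64 uapi. */
#ifndef HWCAP2_MTE
#define HWCAP2_MTE		(1UL << 18)
#endif
#ifndef PROT_MTE
#define PROT_MTE		0x20
#endif
#ifndef PROT_GROWSDOWN
#define PROT_GROWSDOWN		0x01000000
#endif
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL	55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC		(1UL << 1)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT	3
#endif

/* Returns 1 if the kernel reports MTE, 0 otherwise (always 0 off arm64). */
static int mte_supported(void)
{
#ifdef __aarch64__
	return (getauxval(AT_HWCAP2) & HWCAP2_MTE) != 0;
#else
	return 0;
#endif
}

/*
 * Returns 0 on success, 1 if MTE is not available (nothing is changed),
 * -1 if prctl() or mprotect() fails.
 */
static int enable_mte_on_main_stack(void)
{
	uintptr_t sp, page_sz;

	if (!mte_supported())
		return 1;

	/* Opt in: tagged addresses, synchronous faults, tags 1..15 allowed. */
	if (prctl(PR_SET_TAGGED_ADDR_CTRL,
		  PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
		  (0xfffeUL << PR_MTE_TAG_SHIFT), 0, 0, 0))
		return -1;

	sp = (uintptr_t)__builtin_frame_address(0);
	page_sz = (uintptr_t)sysconf(_SC_PAGESIZE);
	assert(page_sz != 0 && (page_sz & (page_sz - 1)) == 0);

	/* PROT_GROWSDOWN lets the kernel extend the range to the vma start. */
	if (mprotect((void *)(sp & ~(page_sz - 1)), page_sz,
		     PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN))
		return -1;

	return 0;
}
```

A dynamic loader or libc startup routine would call enable_mte_on_main_stack() once, early, and carry on untagged if it returns 1.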

> and i think there should be a runtime mechanism for the brk area: it should be possible to request that future brk expansions are mapped as PROT_MTE so an mte aware malloc implementation can use brk. i think this is not important in the initial design, but if a prctl flag can do it that may be useful to add (may be at a later time).

Looking at the kernel code briefly, I think this would work. We do end up with two vmas for the brk, only the expansion having PROT_MTE, and I'd have to find a way to store the extra flag. From a coding perspective, it's easier to just set PROT_MTE by default on both brk and initial stack ;) (VM_DATA_DEFAULT_FLAGS).

> (and eventually there should be a way to use PROT_MTE on writable global data and appropriate code generation that takes colors into account when globals are accessed, but that requires significant ELF, ld.so and compiler changes, that need not be part of the initial mte design).

The .data section needs to be driven by the ELF information. It's also a file mapping and we don't support PROT_MTE on them even if MAP_PRIVATE. There are complications like DAX where the file you mmap for CoW may be hosted on memory that does not support MTE (copied to RAM on write). Is there a use-case for global data to be tagged?

--
Catalin