[PATCH v3 07/21] arm64: move kernel image to base of vmalloc area
From: Ard Biesheuvel <hidden>
Date: 2016-01-13 15:50:27
Also in:
lkml
On 13 January 2016 at 14:51, Mark Rutland [off-list ref] wrote:
> On Wed, Jan 13, 2016 at 09:39:41AM +0100, Ard Biesheuvel wrote:
>> On 12 January 2016 at 19:14, Mark Rutland [off-list ref] wrote:
>>> On Mon, Jan 11, 2016 at 02:19:00PM +0100, Ard Biesheuvel wrote:
>>>> This moves the module area to right before the vmalloc area, and moves
>>>> the kernel image to the base of the vmalloc area. This is an
>>>> intermediate step towards implementing kASLR, where the kernel image
>>>> can be located anywhere in the vmalloc area.
>>>>
>>>> Signed-off-by: Ard Biesheuvel <redacted>
>>>> ---
>>>>  arch/arm64/include/asm/kasan.h          | 20 ++++---
>>>>  arch/arm64/include/asm/kernel-pgtable.h |  5 +-
>>>>  arch/arm64/include/asm/memory.h         | 18 ++++--
>>>>  arch/arm64/include/asm/pgtable.h        |  7 ---
>>>>  arch/arm64/kernel/setup.c               | 12 ++++
>>>>  arch/arm64/mm/dump.c                    | 12 ++--
>>>>  arch/arm64/mm/init.c                    | 20 +++----
>>>>  arch/arm64/mm/kasan_init.c              | 21 +++++--
>>>>  arch/arm64/mm/mmu.c                     | 62 ++++++++++++++------
>>>>  9 files changed, 118 insertions(+), 59 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
>>>> index de0d21211c34..2c583dbf4746 100644
>>>> --- a/arch/arm64/include/asm/kasan.h
>>>> +++ b/arch/arm64/include/asm/kasan.h
>>>> @@ -1,20 +1,16 @@
>>>>  #ifndef __ASM_KASAN_H
>>>>  #define __ASM_KASAN_H
>>>>
>>>> -#ifndef __ASSEMBLY__
>>>> -
>>>>  #ifdef CONFIG_KASAN
>>>>
>>>>  #include <linux/linkage.h>
>>>> -#include <asm/memory.h>
>>>> -#include <asm/pgtable-types.h>
>>>>
>>>>  /*
>>>>   * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
>>>>   * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
>>>>   */
>>>> -#define KASAN_SHADOW_START (VA_START)
>>>> -#define KASAN_SHADOW_END (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
>>>> +#define KASAN_SHADOW_START (VA_START)
>>>> +#define KASAN_SHADOW_END (KASAN_SHADOW_START + (_AC(1, UL) << (VA_BITS - 3)))
>>>>
>>>>  /*
>>>>   * This value is used to map an address to the corresponding shadow
>>>> @@ -26,16 +22,22 @@
>>>>   * should satisfy the following equation:
>>>>   * KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1ULL << 61)
>>>>   */
>>>> -#define KASAN_SHADOW_OFFSET (KASAN_SHADOW_END - (1ULL << (64 - 3)))
>>>> +#define KASAN_SHADOW_OFFSET (KASAN_SHADOW_END - (_AC(1, ULL) << (64 - 3)))
>>>> +
>>>
>>> I couldn't immediately spot where KASAN_SHADOW_* were used in assembly.
>>> I guess there's some other definition built atop of them that I've
>>> missed.
>>>
>>> Where should I be looking?
>>
>> Well, the problem is that KIMAGE_VADDR will be defined in terms of
>> KASAN_SHADOW_END if KASAN is enabled.
>
> Ah. I'd somehow managed to overlook that. Thanks for pointing that out!
>
>> But since KASAN always uses the first 1/8 of that VA space, I am going
>> to rework this so that the non-KASAN constants never depend on the
>> actual values but only on CONFIG_KASAN
>
> Personally I'd prefer that they were obviously defined in terms of each
> other if possible (as this means that the definitions are obviously
> consistent by construction).
>
> So if it's not too much of a pain to keep them that way it would be nice
> to do so.
>
> [...]

I am leaning towards adding this to asm/memory.h

  #ifdef CONFIG_KASAN
  #define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - 3))
  #else
  #define KASAN_SHADOW_SIZE (0)
  #endif

and removing the #ifdef CONFIG_KASAN block from asm/pgtable.h. Then
asm/kasan.h, which already includes asm/memory.h, can use it as region
size, and none of the reshuffling I had to do before is necessary.
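
To illustrate the idea outside the kernel, here is a minimal userspace
sketch of how a single KASAN_SHADOW_SIZE constant keeps the other
definitions consistent by construction. Everything prefixed DEMO_ is an
assumption made for illustration: DEMO_CONFIG_KASAN stands in for
CONFIG_KASAN, DEMO_VA_BITS is assumed to be 48, and DEMO_REGION_BASE is a
hypothetical stand-in for whatever region ends up following the shadow
area. It is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for CONFIG_KASAN; set to 0 to model a non-KASAN build. */
#define DEMO_CONFIG_KASAN 1

/* Assumed VA size; the real value depends on the kernel configuration. */
#define DEMO_VA_BITS  48
#define DEMO_VA_START (UINT64_MAX << DEMO_VA_BITS)

/* The shadow covers 1/8 of the kernel VA space, or nothing without KASAN. */
#if DEMO_CONFIG_KASAN
#define DEMO_KASAN_SHADOW_SIZE (UINT64_C(1) << (DEMO_VA_BITS - 3))
#else
#define DEMO_KASAN_SHADOW_SIZE UINT64_C(0)
#endif

/* KASAN-side constants, expressed via the shared size. */
#define DEMO_KASAN_SHADOW_START DEMO_VA_START
#define DEMO_KASAN_SHADOW_END \
	(DEMO_KASAN_SHADOW_START + DEMO_KASAN_SHADOW_SIZE)

/* Hypothetical non-KASAN constant: depends on the size only. */
#define DEMO_REGION_BASE (DEMO_VA_START + DEMO_KASAN_SHADOW_SIZE)

/* Both views must agree whether or not the shadow exists. */
static_assert(DEMO_REGION_BASE == DEMO_KASAN_SHADOW_END,
	      "region must start where the shadow ends");
```

With DEMO_CONFIG_KASAN set to 0 the size collapses to zero and the region
base falls back to DEMO_VA_START, with no other definition changing.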

>>>> + vmlinux_vm.flags = VM_MAP;
>>>
>>> I was going to say we should set VM_KASAN also per its description in
>>> include/vmalloc.h, though per its uses it's not clear if it will ever
>>> matter.
>>
>> No, we shouldn't. Even if we are never going to unmap this vma, setting
>> the flag will result in the shadow area being freed using vfree(), while
>> it was not allocated via vmalloc() so that is likely to cause trouble.
>
> Ok.

>>>> + vm_area_add_early(&vmlinux_vm);
>>>
>>> Do we need to register the kernel VA range quite this early, or could
>>> we do this around paging_init/map_kernel time?
>>
>> No. Locally, I moved it into map_kernel_chunk, so that we have separate
>> areas for _text, _init and _data, and we can unmap the _init entirely
>> rather than only stripping the exec bit. I haven't quite figured out how
>> to get rid of the vma area, but perhaps it makes sense to keep it
>> reserved, so that modules don't end up there later (which is possible
>> with the module region randomization I have implemented for v4) since I
>> don't know how well things like kallsyms etc cope with that.
>
> Keeping that reserved sounds reasonable to me.
>
> [...]

>>>>  void __init kasan_init(void)
>>>>  {
>>>> +        u64 kimg_shadow_start, kimg_shadow_end;
>>>>          struct memblock_region *reg;
>>>> +        kimg_shadow_start = round_down((u64)kasan_mem_to_shadow(_text),
>>>> +                                       SWAPPER_BLOCK_SIZE);
>>>> +        kimg_shadow_end = round_up((u64)kasan_mem_to_shadow(_end),
>>>> +                                   SWAPPER_BLOCK_SIZE);
>>>
>>> This rounding looks suspect to me, given it's applied to the shadow
>>> addresses rather than the kimage addresses. That's roughly equivalent
>>> to kasan_mem_to_shadow(round_up(_end, 8 * SWAPPER_BLOCK_SIZE)).
>>>
>>> I don't think we need any rounding for the kimage addresses. The image
>>> end is page-granular (and the fine-grained mapping will reflect that).
>>> Any accesses between _end and round_up(_end, SWAPPER_BLOCK_SIZE) would
>>> be bugs (and would most likely fault) regardless of KASAN.
>>>
>>> Or am I just being thick here?
>>
>> Well, the problem here is that vmemmap_populate() is used as a surrogate
>> vmalloc() since that is not available yet, and vmemmap_populate()
>> allocates in SWAPPER_BLOCK_SIZE granularity. If I remove the rounding, I
>> get false positive kasan errors which I have not quite diagnosed yet,
>> but are probably due to the fact that the rounding performed by
>> vmemmap_populate() goes in the wrong direction.
>
> Ah. :(
>
> I'll also take a peek.

Yes, please.
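
For anyone following along, the "roughly equivalent" relation quoted
above can be checked with a small userspace sketch. This is not the
kernel implementation: the demo_* helpers and DEMO_* constants are made
up for illustration, assuming VA_BITS = 48, a 2 MiB SWAPPER_BLOCK_SIZE
(as with 4K pages), and the shadow offset that follows from the quoted
KASAN_SHADOW_OFFSET formula under those assumptions. The two sides match
exactly only for addresses that are 8-byte aligned, and only because the
offset itself is block aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 2 MiB blocks, 1 shadow byte per 8 bytes of memory. */
#define DEMO_BLOCK_SIZE    (UINT64_C(1) << 21)
#define DEMO_SHADOW_SHIFT  3
/* KASAN_SHADOW_END - (1 << 61), assuming VA_BITS = 48. */
#define DEMO_SHADOW_OFFSET UINT64_C(0xdfff200000000000)

/* The equivalence below relies on the offset being block aligned. */
static_assert((DEMO_SHADOW_OFFSET & (DEMO_BLOCK_SIZE - 1)) == 0,
	      "shadow offset must be block aligned");

/* Round x down or up to a power-of-two alignment. */
static inline uint64_t demo_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static inline uint64_t demo_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Map a memory address to the address of its shadow byte. */
static inline uint64_t demo_mem_to_shadow(uint64_t addr)
{
	return (addr >> DEMO_SHADOW_SHIFT) + DEMO_SHADOW_OFFSET;
}

/* Inverse mapping: first memory address covered by a shadow byte. */
static inline uint64_t demo_shadow_to_mem(uint64_t shadow)
{
	return (shadow - DEMO_SHADOW_OFFSET) << DEMO_SHADOW_SHIFT;
}
```

Rounding the shadow end up by one block therefore extends the covered
memory range up to the next 8 * 2 MiB = 16 MiB boundary past the image
end, which is where the 16 MB figure in the next remark comes from.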

>> I do wonder what that means for memblocks that are not multiples of 16
>> MB, though (below)
>
> Indeed.
>
> On a related note, something I've been thinking about is PA layout
> fuzzing using VMs.
>
> It sounds like being able to test memory layouts would be useful for
> cases like the above, and I suspect there are plenty of other edge cases
> that we aren't yet aware of due to typical physical memory layouts being
> fairly simple.
>
> It doesn't seem to be possible to force a particular physical memory
> layout (and particular kernel, dtb, etc addresses) for QEMU or KVM tool.
> I started looking into adding support to KVM tool, but there's a fair
> amount of refactoring needed first.
>
> Another option might be a special EFI application that carves up memory
> in a deliberate fashion to ensure particular fragmentation cases (e.g. a
> bank that's SWAPPER_BLOCK_SIZE - PAGE_SIZE in length).

I use mem= for this, in fact, and boot most of my machines and VMs with
some value slightly below the actual available DRAM that is not a
multiple of 2M.