Re: [PATCH 0/5 v11] KASan for Arm
From: Linus Walleij <hidden>
Date: 2020-07-10 15:11:33
Hi Florian, Ard,

I have managed to get closer to the problem I am seeing on the Qualcomm board. I have some idea of what might be wrong.

On Wed, Jul 1, 2020 at 10:16 PM Florian Fainelli [off-list ref] wrote:
> This branch works a bit better however I am still seeing some boot
> errors (some sample logs attached) similar to Linus' branch.
Can you instrument the code in arch/arm/mm/mmu.c something like this:
@@ -1296,13 +1300,23 @@ static inline void prepare_page_table(void)
 	 * Find the end of the first block of lowmem.
 	 */
 	end = memblock.memory.regions[0].base + memblock.memory.regions[0].size;
-	if (end >= arm_lowmem_limit)
+	if (end >= arm_lowmem_limit) {
+		pr_info("Memblock end is above arm_lowmem_limit (0x%08x)\n",
+			arm_lowmem_limit);
 		end = arm_lowmem_limit;
+	}
+	pr_info("Memblock[0].base: 0x%08x size: 0x%08x, end: 0x%08x\n",
+		memblock.memory.regions[0].base,
+		memblock.memory.regions[0].size,
+		end);
 
 	/*
 	 * Clear out all the kernel space mappings, except for the first
 	 * memory bank, up to the vmalloc region.
 	 */
+	pr_info("Clear PMDs from 0x%08lx to 0x%08lx (lowmem)\n",
+		__phys_to_virt(end),
+		VMALLOC_START);
 	for (addr = __phys_to_virt(end);
 	     addr < VMALLOC_START; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
And then look at what addresses this clears.

What I noticed with the APQ8060 board was that the FDT I could not access was, under some circumstances, inside the lowmem that gets its PMDs cleared in prepare_page_table(), and that is why paging fails.

So when you get a crash like this:

Unable to handle kernel paging request at virtual address bcdffe00

That can be inside a memory block that the kernel has now designated as the first block of lowmem.

I am not entirely sure why this happens, but it seems unrelated to the KASan shadow memory. It seems to be a side effect of the fact that the vmlinux size grows when using KASan: when the kernel is loaded at certain locations, it is so big that it doesn't fit into the first memory block and runs into the lowmem whose PMDs get cleared.

I get the impression that the kernel binary *must* fit into the first memblock.

If this is the root cause, what we need to do is to properly identify this case and print an error if it happens.

Here is how it looks on the Qualcomm APQ8060:

I add prints for which address ranges get their PMDs cleared in arch/arm/mm/mmu.c, function prepare_page_table(), and print out where the early FDT parser looks for the attached device tree.

I see the area above MODULES_VADDR getting cleared (in the case of KASan the shadow memory is skipped over) and then the first memblock is inspected to locate the end of the first block of lowmem. This is memblock[0].

I then test loading the kernel at both 0x40200000 and 0x50000000, since the latter always works. Something must be funny with the way the kernel handles the odd memory blocks. Indeed:

Mainline load kernel at 0x40200000 works:

Memblock[0].base: 0x40200000 size: 0x02c00000, end: 0x42e00000
Clear PMDs from 0xc2e00000 to 0xe0800000 (lowmem)
(...)
OF: fdt: unflatten_and_copy_device_tree initial boot params at c22e3d88
OF: fdt: unflatten_and_copy_device_tree: totalsize: 21058

Mainline load kernel at 0x50000000 works:

fdt: Ignoring memory block 0x40200000 - 0x42e00000
fdt: Ignoring memory range 0x48000000 - 0x50000000
(...)
Memblock end is above arm_lowmem_limit (0x60000000)
Memblock[0].base: 0x50000000 size: 0x10000000, end: 0x60000000
Clear PMDs from 0xd0000000 to 0xd0800000 (lowmem)
(...)
OF: fdt: unflatten_and_copy_device_tree initial boot params at c22e3d88
OF: fdt: unflatten_and_copy_device_tree: totalsize: 21090

KASan-enabled, load kernel at 0x40200000 bugs:

Memblock[0].base: 0x40200000 size: 0x02c00000, end: 0x42e00000
Clear PMDs from 0xc2e00000 to 0xe0800000 (lowmem)
(...)
OF: fdt: unflatten_and_copy_device_tree initial boot params at c3078710
Unable to handle kernel paging request at virtual address c3078714
(crash)

KASan-enabled, load kernel at 0x50000000 works:

fdt: Ignoring memory block 0x40200000 - 0x42e00000
fdt: Ignoring memory range 0x48000000 - 0x50000000
(...)
Memblock end is above arm_lowmem_limit (0x60000000)
Memblock[0].base: 0x50000000 size: 0x10000000, end: 0x60000000
Clear PMDs from 0xd0000000 to 0xd0800000 (lowmem)
(...)
OF: fdt: unflatten_and_copy_device_tree initial boot params at c3078710
OF: fdt: unflatten_and_copy_device_tree: totalsize: 21090

Yours,
Linus Walleij