Re: [PATCH v6 1/7] arm64: mm: Move reserve_crashkernel() into mem_init()
From: Nicolas Saenz Julienne <hidden>
Date: 2020-11-19 14:10:23
Also in:
linux-devicetree, linux-iommu, lkml
Hi Catalin, James,
sorry for the late reply but I got sidetracked.

On Fri, 2020-11-13 at 11:29 +0000, Catalin Marinas wrote:
[...]
> > > > Let me stress that knowing the DMA constraints in the system before
> > > > reserving crashkernel's regions is necessary if we ever want it to
> > > > work seamlessly on all platforms. Be it small stuff like the
> > > > Raspberry Pi or huge servers with TB of memory.
> > >
> > > Indeed. So we have 3 options (so far):
> > >
> > > 1. Allow the crashkernel reservation to go into the linear map but set
> > >    it to invalid once allocated.
> > >
> > > 2. Parse the flattened DT (not sure what we do with ACPI) before
> > >    creating the linear map. We may have to rely on some SoC ID here
> > >    instead of actual DMA ranges.
> > >
> > > 3. Assume the smallest ZONE_DMA possible on arm64 (1GB) for crashkernel
> > >    reservations and not rely on arm64_dma_phys_limit in
> > >    reserve_crashkernel().
> > >
> > > I think (2) we tried hard to avoid. Option (3) brings us back to the
> > > issues we had on large crashkernel reservations regressing on some
> > > platforms (though it's been a while since, they mostly went quiet ;)).
> > > However, with Chen's crashkernel patches we end up with two
> > > reservations, one in the low DMA zone and one higher, potentially above
> > > 4GB. Having a fixed 1GB limit wouldn't be any worse for crashkernel
> > > reservations than what we have now.
> > >
> > > If (1) works, I'd go for it (James knows this part better than me),
> > > otherwise we can go for (3).
> >
> > Overall, I'd prefer (1) as well, and I'd be happy to have a go at it.
> > If not I'll append (3) in this series.
>
> I think for 1 we could also remove the additional KEXEC_CORE checks,
> something like below, untested:
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 3e5a6913acc8..27ab609c1c0c 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -477,7 +477,8 @@ static void __init map_mem(pgd_t *pgdp)
>  	int flags = 0;
>  	u64 i;
>  
> -	if (rodata_full || debug_pagealloc_enabled())
> +	if (rodata_full || debug_pagealloc_enabled() ||
> +	    IS_ENABLED(CONFIG_KEXEC_CORE))
>  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>  
>  	/*
> @@ -487,11 +488,6 @@ static void __init map_mem(pgd_t *pgdp)
>  	 * the following for-loop
>  	 */
>  	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
> -#ifdef CONFIG_KEXEC_CORE
> -	if (crashk_res.end)
> -		memblock_mark_nomap(crashk_res.start,
> -				    resource_size(&crashk_res));
> -#endif
>  
>  	/* map all the memory banks */
>  	for_each_mem_range(i, &start, &end) {
> @@ -518,21 +514,6 @@ static void __init map_mem(pgd_t *pgdp)
>  	__map_memblock(pgdp, kernel_start, kernel_end,
>  		       PAGE_KERNEL, NO_CONT_MAPPINGS);
>  	memblock_clear_nomap(kernel_start, kernel_end - kernel_start);
> -
> -#ifdef CONFIG_KEXEC_CORE
> -	/*
> -	 * Use page-level mappings here so that we can shrink the region
> -	 * in page granularity and put back unused memory to buddy system
> -	 * through /sys/kernel/kexec_crash_size interface.
> -	 */
> -	if (crashk_res.end) {
> -		__map_memblock(pgdp, crashk_res.start, crashk_res.end + 1,
> -			       PAGE_KERNEL,
> -			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
> -		memblock_clear_nomap(crashk_res.start,
> -				     resource_size(&crashk_res));
> -	}
> -#endif
>  }
>  
>  void mark_rodata_ro(void)

So as far as I'm concerned this is good enough for me. I took the time to
properly test crashkernel on RPi4 using the series, this patch, and another
small fix to properly update /proc/iomem.

I'll send v7 soon, but before that, James (or anyone for that matter), any
obvious push-back to Catalin's solution?

Regards,
Nicolas
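
To make the trade-off in the patch above concrete, here is a rough
user-space sketch, not kernel code. It models why page-level mappings
(NO_BLOCK_MAPPINGS) let a region be cut at any page boundary, while a
block mapping only allows cuts at block boundaries. The toy_* names, the
4K page and 2M block sizes, and the flag bit values are all assumptions
made for illustration; the real linear map picks its mapping sizes per
range and is more subtle than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the kernel has its own definitions. */
#define NO_BLOCK_MAPPINGS	(1u << 0)
#define NO_CONT_MAPPINGS	(1u << 1)

#define TOY_PAGE_SIZE	(UINT64_C(1) << 12)	/* assumption: 4K pages */
#define TOY_BLOCK_SIZE	(UINT64_C(1) << 21)	/* assumption: 2M blocks */

/* Mirrors the condition in the proposed map_mem() hunk. */
static unsigned int toy_map_mem_flags(bool rodata_full, bool debug_pagealloc,
				      bool kexec_core)
{
	if (rodata_full || debug_pagealloc || kexec_core)
		return NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
	return 0;
}

/* Smallest unit this toy model can remap without splitting an entry. */
static uint64_t toy_granule(unsigned int flags)
{
	return (flags & NO_BLOCK_MAPPINGS) ? TOY_PAGE_SIZE : TOY_BLOCK_SIZE;
}

/* True if a region can be cut at 'boundary' without splitting a mapping. */
static bool toy_can_cut_at(uint64_t boundary, unsigned int flags)
{
	uint64_t granule = toy_granule(flags);

	assert(granule != 0);
	return (boundary % granule) == 0;
}
```

In this model, with kexec_core set, a boundary three pages into a region
is a valid cut point; with no flags set, only 2M-aligned boundaries are.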