Thread: 19 messages, 3 authors, 2020-11-19

Re: [PATCH v6 1/7] arm64: mm: Move reserve_crashkernel() into mem_init()

From: Nicolas Saenz Julienne <hidden>
Date: 2020-11-19 14:10:23
Also in: linux-devicetree, linux-iommu, lkml

Hi Catalin, James,
sorry for the late reply but I got sidetracked.

On Fri, 2020-11-13 at 11:29 +0000, Catalin Marinas wrote:
[...]
Let me stress that knowing the DMA constraints in the system before reserving
crashkernel's regions is necessary if we ever want it to work seamlessly on all
platforms. Be it small stuff like the Raspberry Pi or huge servers with TB of
memory.
Indeed. So we have 3 options (so far):

1. Allow the crashkernel reservation to go into the linear map but set
   it to invalid once allocated.

2. Parse the flattened DT (not sure what we do with ACPI) before
   creating the linear map. We may have to rely on some SoC ID here
   instead of actual DMA ranges.

3. Assume the smallest ZONE_DMA possible on arm64 (1GB) for crashkernel
   reservations and not rely on arm64_dma_phys_limit in
   reserve_crashkernel().

I think (2) we tried hard to avoid. Option (3) brings us back to the
issues we had on large crashkernel reservations regressing on some
platforms (though it's been a while since, they mostly went quiet ;)).
However, with Chen's crashkernel patches we end up with two
reservations, one in the low DMA zone and one higher, potentially above
4GB. Having a fixed 1GB limit wouldn't be any worse for crashkernel
reservations than what we have now.

If (1) works, I'd go for it (James knows this part better than me),
otherwise we can go for (3).
Overall, I'd prefer (1) as well, and I'd be happy to have a go at it. If not,
I'll append (3) in this series.
I think for 1 we could also remove the additional KEXEC_CORE checks,
something like below, untested:
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 3e5a6913acc8..27ab609c1c0c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -477,7 +477,8 @@ static void __init map_mem(pgd_t *pgdp)
 	int flags = 0;
 	u64 i;
 
-	if (rodata_full || debug_pagealloc_enabled())
+	if (rodata_full || debug_pagealloc_enabled() ||
+	    IS_ENABLED(CONFIG_KEXEC_CORE))
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	/*
@@ -487,11 +488,6 @@ static void __init map_mem(pgd_t *pgdp)
 	 * the following for-loop
 	 */
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
-#ifdef CONFIG_KEXEC_CORE
-	if (crashk_res.end)
-		memblock_mark_nomap(crashk_res.start,
-				    resource_size(&crashk_res));
-#endif
 
 	/* map all the memory banks */
 	for_each_mem_range(i, &start, &end) {
@@ -518,21 +514,6 @@ static void __init map_mem(pgd_t *pgdp)
 	__map_memblock(pgdp, kernel_start, kernel_end,
 		       PAGE_KERNEL, NO_CONT_MAPPINGS);
 	memblock_clear_nomap(kernel_start, kernel_end - kernel_start);
-
-#ifdef CONFIG_KEXEC_CORE
-	/*
-	 * Use page-level mappings here so that we can shrink the region
-	 * in page granularity and put back unused memory to buddy system
-	 * through /sys/kernel/kexec_crash_size interface.
-	 */
-	if (crashk_res.end) {
-		__map_memblock(pgdp, crashk_res.start, crashk_res.end + 1,
-			       PAGE_KERNEL,
-			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-		memblock_clear_nomap(crashk_res.start,
-				     resource_size(&crashk_res));
-	}
-#endif
 }
 
 void mark_rodata_ro(void)
So as far as I'm concerned this is good enough for me. I took the time to
properly test crashkernel on RPi4 using the series, this patch, and another
small fix to properly update /proc/iomem.

I'll send v7 soon, but before that, James (or anyone for that matter), any
obvious push-back to Catalin's solution?

Regards,
Nicolas