Thread (19 messages), 8 authors, 2020-10-28

Re: [PATCH v4 5/7] arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges

From: Nicolas Saenz Julienne <hidden>
Date: 2020-10-23 15:27:54
Also in: linux-devicetree, linux-iommu, lkml

Hi Catalin,

On Thu, 2020-10-22 at 19:06 +0100, Catalin Marinas wrote:
On Wed, Oct 21, 2020 at 02:34:35PM +0200, Nicolas Saenz Julienne wrote:
@@ -188,9 +186,11 @@ static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
 {
 	unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
+	unsigned int __maybe_unused dt_zone_dma_bits;
 
 #ifdef CONFIG_ZONE_DMA
-	zone_dma_bits = ARM64_ZONE_DMA_BITS;
+	dt_zone_dma_bits = ilog2(of_dma_get_max_cpu_address(NULL));
+	zone_dma_bits = min(32U, dt_zone_dma_bits);
A thought: can we remove the min here and expand ZONE_DMA to whatever
dt_zone_dma_bits says? More on this below.
On most platforms we'd get PHYS_ADDR_MAX, or something bigger than the actual
amount of RAM, which would ultimately create a system-wide ZONE_DMA. At first
sight, I don't see it breaking dma-direct in any way.

On the other hand, there is a large number of MMIO devices out there that can
only handle 32-bit addressing, be it PCI cards or actual IP cores. To make
things worse, this limitation is often expressed in the driver, not FW (with
dma_set_mask() and friends). If those devices aren't behind an IOMMU we have to
be able to provide at least 32-bit addressable memory. See this comment from
dma_direct_supported():

/*
 * Because 32-bit DMA masks are so common we expect every architecture
 * to be able to satisfy them - either by not supporting more physical
 * memory, or by providing a ZONE_DMA32.  If neither is the case, the
 * architecture needs to use an IOMMU instead of the direct mapping.
 */

I think, for the common case, we're stuck with at least one zone spanning the
32-bit address space.
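
For illustration only, here is a minimal user-space sketch of the clamp in the hunk above. The helper names ilog2_u64() and pick_zone_dma_bits() are hypothetical stand-ins, not kernel functions, and the input plays the role of whatever of_dma_get_max_cpu_address() would report.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's ilog2(): index of the highest set bit. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int bit = 0;

	assert(v != 0);
	while (v >>= 1)
		bit++;
	return bit;
}

/*
 * Mirrors "zone_dma_bits = min(32U, dt_zone_dma_bits)" from the hunk:
 * the devicetree may shrink ZONE_DMA below 32 bits, but never grow it.
 * max_cpu_addr is a hypothetical input, not a value read from a real DT.
 */
static unsigned int pick_zone_dma_bits(uint64_t max_cpu_addr)
{
	unsigned int dt_bits = ilog2_u64(max_cpu_addr);

	return dt_bits < 32U ? dt_bits : 32U;
}
```

With an all-ones address (the "no restriction" case) the result stays at 32, while a hypothetical limit of 0x40000000 yields 30. Dropping the min would let the first case grow to 63 bits, which is the system-wide ZONE_DMA described above.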
 	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
 	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
 #endif
I was talking earlier to Ard and Robin on the ZONE_DMA32 history and the
need for max_zone_phys(). This was rather theoretical, the Seattle
platform has all RAM starting above 4GB and that led to an empty
ZONE_DMA32 originally. The max_zone_phys() hack was meant to lift
ZONE_DMA32 into the bottom of the RAM, on the assumption that such
32-bit devices would have a DMA offset hardwired. We are not aware of
any such case on arm64 systems and even on Seattle, IIUC 32-bit devices
only work if they are behind an SMMU (so no hardwired offset).
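
As a rough user-space model of the existing max_zone_phys() behaviour described here (the version on the "-" lines of the diff further down), assuming dram_start and dram_end stand in for memblock_start_of_DRAM() and memblock_end_of_DRAM(); old_max_zone_phys() is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep the bits of the DRAM start address at or above zone_bits, add one
 * zone's worth of address space on top, and cap the result at the end of
 * DRAM. This is what "lifts" the zone to the bottom of RAM when RAM
 * starts above 2^zone_bits.
 */
static uint64_t old_max_zone_phys(unsigned int zone_bits,
				  uint64_t dram_start, uint64_t dram_end)
{
	uint64_t offset, limit;

	assert(zone_bits < 64);
	/* Equivalent of GENMASK_ULL(63, zone_bits) */
	offset = dram_start & (~0ULL << zone_bits);
	limit = offset + (1ULL << zone_bits);

	return limit < dram_end ? limit : dram_end;
}
```

For a made-up layout with 16GB of RAM starting at 0x8000000000 and zone_bits = 32, this returns 0x8100000000, so the zone covers the first 4GB of RAM rather than being empty. With RAM starting at 0 it simply returns the smaller of 4GB and the end of DRAM.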

In hindsight, it would have made more sense on platforms with RAM above
4GB to expand ZONE_DMA32 to cover the whole memory (so empty
ZONE_NORMAL). Something like:
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a53c1e0fb017..7d5e3dd85617 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -187,8 +187,12 @@ static void __init reserve_elfcorehdr(void)
  */
 static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
 {
-	phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, zone_bits);
-	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
+	phys_addr_t zone_mask = 1ULL << zone_bits;
+
+	if (!(memblock_start_of_DRAM() & zone_mask))
+		zone_mask = PHYS_ADDR_MAX;
+
+	return min(zone_mask, memblock_end_of_DRAM());
 }
 
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
I don't think this makes any difference for ZONE_DMA unless a
broken DT or IORT reports the max CPU address below the start of DRAM.

There's a minor issue if of_dma_get_max_cpu_address() matches
memblock_end_of_DRAM() but they are not a power of 2. We'd be left with
a bit of RAM at the end in ZONE_NORMAL due to ilog2 truncation.
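
To put numbers on the truncation, a small user-space sketch follows. It assumes DRAM starts at address 0; floor_log2() and ram_left_out() are hypothetical helpers, with floor_log2() standing in for the kernel's ilog2().

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit, i.e. log2 rounded down. */
static unsigned int floor_log2(uint64_t v)
{
	unsigned int bit = 0;

	assert(v != 0);
	while (v >>= 1)
		bit++;
	return bit;
}

/*
 * Bytes of RAM that end up above a zone limit of 2^floor_log2(max_cpu_addr),
 * assuming DRAM spans [0, dram_end).
 */
static uint64_t ram_left_out(uint64_t max_cpu_addr, uint64_t dram_end)
{
	uint64_t zone_limit = 1ULL << floor_log2(max_cpu_addr);

	return dram_end > zone_limit ? dram_end - zone_limit : 0;
}
```

If both the maximum CPU address and the end of DRAM are 0xC0000000 (3GB), the zone limit rounds down to 2GB and 1GB of RAM falls outside it. When the value is an exact power of 2, such as 0x80000000, nothing is left out.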
I agree it makes no sense to create more than one zone when the beginning of
RAM is located above the 32-bit address space. I'm all for disregarding the
possibility of hardwired offsets. As a bonus, as we already discussed some time
ago, this is something that never played well with current dma-direct code[1].

Regards,
Nicolas

[1] https://lkml.org/lkml/2020/9/8/377