Re: [PATCH v4 5/7] arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
From: Nicolas Saenz Julienne <hidden>
Date: 2020-10-23 15:27:54
Also in:
linux-devicetree, linux-iommu, lkml
Hi Catalin,

On Thu, 2020-10-22 at 19:06 +0100, Catalin Marinas wrote:
> On Wed, Oct 21, 2020 at 02:34:35PM +0200, Nicolas Saenz Julienne wrote:
> > @@ -188,9 +186,11 @@ static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> >  static void __init zone_sizes_init(unsigned long min, unsigned long max)
> >  {
> >  	unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
> > +	unsigned int __maybe_unused dt_zone_dma_bits;
> >  
> >  #ifdef CONFIG_ZONE_DMA
> > -	zone_dma_bits = ARM64_ZONE_DMA_BITS;
> > +	dt_zone_dma_bits = ilog2(of_dma_get_max_cpu_address(NULL));
> > +	zone_dma_bits = min(32U, dt_zone_dma_bits);
>
> A thought: can we remove the min here and expand ZONE_DMA to whatever
> dt_zone_dma_bits says? More on this below.

On most platforms we'd get PHYS_ADDR_MAX, or something bigger than the
actual amount of RAM, which would ultimately create a system-wide
ZONE_DMA. At first sight, I don't see it breaking dma-direct in any way.

On the other hand, there is a large number of MMIO devices out there that
can only handle 32-bit addressing, be it PCI cards or actual IP cores. To
make things worse, this limitation is often expressed in the driver, not
FW (with dma_set_mask() and friends). If those devices aren't behind an
IOMMU we have to be able to provide at least 32-bit addressable memory.
See this comment from dma_direct_supported():

	/*
	 * Because 32-bit DMA masks are so common we expect every architecture
	 * to be able to satisfy them - either by not supporting more physical
	 * memory, or by providing a ZONE_DMA32.  If neither is the case, the
	 * architecture needs to use an IOMMU instead of the direct mapping.
	 */

I think, for the common case, we're stuck with at least one zone spanning
the 32-bit address space.
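
To make the trade-off concrete, here is a minimal user-space sketch (not kernel code) of how the zone width would come out with and without the min(). Everything in it is an assumption made for illustration: the helper names, the `clamp` switch, DRAM starting at physical address 0, and UINT64_MAX standing in for PHYS_ADDR_MAX.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's ilog2(): floor(log2(v)), v != 0. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int bits = 0;

	assert(v != 0);
	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Model of the hunk under discussion: derive the ZONE_DMA width from the
 * highest CPU address reachable by DMA, optionally clamped to 32 bits.
 * `clamp` is a hypothetical switch that exists only in this sketch.
 */
static unsigned int model_zone_dma_bits(uint64_t max_cpu_addr, int clamp)
{
	unsigned int bits = ilog2_u64(max_cpu_addr);

	if (clamp && bits > 32)
		bits = 32;
	return bits;
}

/*
 * Upper bound of ZONE_DMA, assuming DRAM starts at physical address 0
 * (a simplification; the kernel's max_zone_phys() also handles offsets).
 */
static uint64_t model_zone_dma_limit(unsigned int bits, uint64_t dram_end)
{
	uint64_t limit;

	assert(bits < 64);
	limit = 1ULL << bits;
	return limit < dram_end ? limit : dram_end;
}

/* Can every page in [0, zone_limit) be addressed with a 32-bit DMA mask? */
static int zone_fits_32bit_mask(uint64_t zone_limit)
{
	return zone_limit <= (1ULL << 32);
}
```

With 8 GiB of RAM and no DMA restriction reported, calling model_zone_dma_bits(UINT64_MAX, 0) gives a zone that covers all 8 GiB, so a device limited to a 32-bit mask has no zone it can safely allocate from. With the clamp, the zone stops at 4 GiB and the 32-bit mask can still be satisfied.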

> >  	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
> >  	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
> >  #endif
>
> I was talking earlier to Ard and Robin on the ZONE_DMA32 history and the
> need for max_zone_phys(). This was rather theoretical, the Seattle
> platform has all RAM starting above 4GB and that led to an empty
> ZONE_DMA32 originally. The max_zone_phys() hack was meant to lift
> ZONE_DMA32 into the bottom of the RAM, on the assumption that such
> 32-bit devices would have a DMA offset hardwired. We are not aware of
> any such case on arm64 systems and even on Seattle, IIUC 32-bit devices
> only work if they are behind an SMMU (so no hardwired offset).
>
> In hindsight, it would have made more sense on platforms with RAM above
> 4GB to expand ZONE_DMA32 to cover the whole memory (so empty
> ZONE_NORMAL). Something like:
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index a53c1e0fb017..7d5e3dd85617 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -187,8 +187,12 @@ static void __init reserve_elfcorehdr(void)
>   */
>  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>  {
> -	phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, zone_bits);
> -	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
> +	phys_addr_t zone_mask = 1ULL << zone_bits;
> +
> +	if (!(memblock_start_of_DRAM() & zone_mask))
> +		zone_mask = PHYS_ADDR_MAX;
> +
> +	return min(zone_mask, memblock_end_of_DRAM());
>  }
>  
>  static void __init zone_sizes_init(unsigned long min, unsigned long max)
>
> I don't think this makes any difference for ZONE_DMA unless a broken DT
> or IORT reports the max CPU address below the start of DRAM.
>
> There's a minor issue if of_dma_get_max_cpu_address() matches
> memblock_end_of_DRAM() but they are not a power of 2. We'd be left with
> a bit of RAM at the end in ZONE_NORMAL due to ilog2 truncation.
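
The two points above can be checked with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: the function names are made up for illustration, the memory layouts are example numbers rather than a real board's map, and UINT64_MAX stands in for PHYS_ADDR_MAX. The first function mirrors the removed lines of the diff. The second models the behaviour described in the prose (a zone spanning all memory when RAM begins above the zone's range) rather than transcribing the diff's exact condition.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)) for v != 0, standing in for the kernel's ilog2(). */
static unsigned int floor_log2(uint64_t v)
{
	unsigned int bits = 0;

	assert(v != 0);
	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Mirrors the removed ("-") lines: keep the address bits at and above
 * zone_bits from the start of DRAM, then add one zone's worth on top.
 */
static uint64_t max_zone_phys_current(unsigned int zone_bits,
				      uint64_t dram_start, uint64_t dram_end)
{
	uint64_t offset, limit;

	assert(zone_bits < 64);
	/* Same bits as GENMASK_ULL(63, zone_bits). */
	offset = dram_start & ~((1ULL << zone_bits) - 1);
	limit = offset + (1ULL << zone_bits);
	return limit < dram_end ? limit : dram_end;
}

/*
 * Models the intent stated in the prose (an assumption of this sketch):
 * if all RAM sits above the zone's addressable range, let the zone span
 * all of memory instead of lifting it to the bottom of RAM.
 */
static uint64_t max_zone_phys_expand(unsigned int zone_bits,
				     uint64_t dram_start, uint64_t dram_end)
{
	uint64_t limit;

	assert(zone_bits < 64);
	limit = 1ULL << zone_bits;
	if (dram_start >= limit)
		limit = UINT64_MAX;	/* stands in for PHYS_ADDR_MAX */
	return limit < dram_end ? limit : dram_end;
}

/*
 * RAM left above the zone when the DMA limit is rounded down to a power
 * of 2, assuming DRAM starts at physical address 0.
 */
static uint64_t ram_left_in_zone_normal(uint64_t max_cpu_addr,
					uint64_t dram_end)
{
	uint64_t zone_end = 1ULL << floor_log2(max_cpu_addr);

	return dram_end > zone_end ? dram_end - zone_end : 0;
}
```

For an example layout with 16 GiB of RAM starting at 0x8000000000, max_zone_phys_current(32, ...) ends the zone 4 GiB into RAM, while max_zone_phys_expand(32, ...) ends it at the end of DRAM, leaving ZONE_NORMAL empty. For the truncation case, ram_left_in_zone_normal(0xC0000000, 0xC0000000) reports 1 GiB stranded in ZONE_NORMAL, since 3 GiB rounds down to 2 GiB.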

I agree it makes no sense to create more than one zone when the beginning
of RAM is located above the 32-bit address space. I'm all for disregarding
the possibility of hardwired offsets. As a bonus, as we already discussed
some time ago, this is something that never played well with current
dma-direct code[1].

Regards,
Nicolas

[1] https://lkml.org/lkml/2020/9/8/377