Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs
From: Peter Ujfalusi <hidden>
Date: 2020-01-14 10:43:22
Also in:
linux-iommu, lkml
Subsystem:
arm port, the rest · Maintainers:
Russell King, Linus Torvalds
Christoph, Robin, On 09/01/2020 16.49, Christoph Hellwig wrote:
On Wed, Jan 08, 2020 at 03:20:07PM +0000, Robin Murphy wrote:quoted
quoted
The problem - I think - is that the DMA_BIT_MASK(32) from dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)) is treated as physical address along the call path so the dma_pfn_offset is applied to it and the check will fail, saying that DMA_BIT_MASK(32) can not be supported.But that's the thing - in isolation, that is entirely correct. Considering ZONE_DMA32 for simplicity, in general the zone is expected to cover the physical address range 0x0000_0000 - 0xffff_ffff (because DMA offsets are relatively rare), and a device with a dma_pfn_offset of more than (0x1_0000_0000 >> PAGE_SHIFT) *cannot* support that range with any mask, because the DMA address itself would have to be negative.Note that ZONE_DMA32 is irrelevant in this particular case, as we are talking about arm32. But with ZONE_DMA instead this roughly makes sense.quoted
The problem is that platforms with esoteric memory maps have no right thing to do. If the base of RAM is at at 0x1_0000_0000 or higher, the "correct" ZONE_DMA32 would be empty while ZONE_NORMAL above it would not, and last time I looked that makes the page allocator break badly. So the standard bodge on such platforms is to make ZONE_DMA32 cover not the first 4GB of *PA space*, but the first 4GB of *RAM*, wherever that happens to be. That then brings different problems - now the page allocator is happy and successfully returns GFP_DMA32 allocations from the range 0x8_0000_0000 - 0x8_ffff_ffff that are utterly useless to 32-bit devices with zero dma_pfn_offset - see the AMD Seattle SoC for the prime example of that. If on the other hand all devices are guaranteed to have a dma_pfn_offset that puts the base of RAM at DMA address 0 then GFP_DMA32 allocations do end up working as expected, but now the original assumption of where ZONE_DMA32 actually is is broken, so generic code unaware of the platform/architecture-specific bodge will be misled - that's the case you're running into. Having thought this far, if there's a non-hacky way to reach in and grab ZONE_DMA{32} such that dma_direct_supported() could use zone_end_pfn() instead of trying to assume either way, that might be the most robust general solution.zone_dma_bits is our somewhat ugly way to try to poke into this information, although the way it is done right now sucks pretty badly.
In my view the handling of dma_pfn_offset is just incorrect as it is applied to _any_ address.
According to DT specification dma-ranges:
"Value type: <empty> or <prop-encoded-array> encoded as an arbitrary
number of (child-bus-address, parent-bus-address, length) triplets."
Yet in drivers/of/ we only take the _first_ triplet and ignore the rest.
The dma_pfn_offset should be only applied to paddr in the range:
parent-bus-address to parent-bus-address+length
for anything outside of this the dma_pfn_offset is 0.
conversion back from dma to paddr should consider the offset in range:
child-bus-address to child-bus-address+length
and 0 for everything outside of this.
To correctly handle the dma-ranges we would need something like this in device.h:
+struct dma_ranges {
+ u64 paddr;
+ u64 dma_addr;
+ u64 size;
+ unsigned long pfn_offset;
+};
+
struct device {
...
- unsigned long dma_pfn_offset;
+ struct dma_ranges *dma_ranges;
int dma_ranges_cnt;
...
};
Then when we currently use dma_pfn_offset we would have:
unsigned long __phys_to_dma_pfn_offset(struct device *dev, phys_addr_t paddr)
{
int i;
if (!dev->dma_ranges)
return 0;
for (i = 0; i < dev->dma_ranges_cnt; i++) {
struct dma_ranges *range = &dev->dma_ranges[i];
if (paddr >= range->paddr &&
paddr <= (range->paddr + range->size))
return range->pfn_offset;
}
return 0;
}
unsigned long __dma_to_phys_pfn_offset(struct device *dev, dma_addr_t dma_addr)
{
int i;
for (i = 0; i < dev->dma_ranges_cnt; i++) {
struct dma_ranges *range = &dev->dma_ranges[i];
if (dma_addr >= range->dma_addr &&
dma_addr <= (range->dma_addr + range->size))
return range->pfn_offset;
}
return 0;
}
For existing drivers/archs setting dma_pfn_offset we can:
if (dev->dma_ranges_cnt == 1 && dev->dma_ranges[0].pfn_offset && !dev->dma_ranges[0].size)
return dev->dma_ranges[0].pfn_offset;
and they would have to set up one struct dma_ranges.
One of the issue with this is that the struct dma_ranges would need to be allocated for
all devices, so there should be a some clever way need to be invented to use pointers
as much as we can.
The patch I sent to Peter in December was trying to convey that information in a way similar to what the arm32 legacy dma code does, but it didn't work, so I'll need to find some time to sit down and figure out why.
But, while we get a proper solution can we get the following patch in to fix the regression?
Basically we are falling back to what works (and was used before commit ad3c7b18c5b362be5dbd0f2c0bcf1fd5fd659315).
commit 8c3c36b377c139603a9dff5c58dac59865f1ac0f
Author: Peter Ujfalusi [off-list ref]
Date: Thu Dec 19 15:07:25 2019 +0200
arm: mm: dma-mapping: Fix dma_supported() when dev->dma_pfn_offset is not 0
We can only use direct mapping when LPAE is enabled if the dma_pfn_offset
is 0, otherwise valid dma_masks will be rejected and the DMA support is
going to be denied for peripherals, or DMA drivers.
Cc: Stable [off-list ref] #v5.3+
Signed-off-by: Peter Ujfalusi [off-list ref]
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 9414d72f664b..e07ec1ea3865 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c@@ -1100,15 +1100,6 @@ int arm_dma_supported(struct device *dev, u64 mask) static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent) { - /* - * When CONFIG_ARM_LPAE is set, physical address can extend above - * 32-bits, which then can't be addressed by devices that only support - * 32-bit DMA. - * Use the generic dma-direct / swiotlb ops code in that case, as that - * handles bounce buffering for us. - */ - if (IS_ENABLED(CONFIG_ARM_LPAE)) - return NULL; return coherent ? &arm_coherent_dma_ops : &arm_dma_ops; }
@@ -2313,6 +2304,15 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, if (arm_setup_iommu_dma_ops(dev, dma_base, size, iommu)) dma_ops = arm_get_iommu_dma_map_ops(coherent); + else if (IS_ENABLED(CONFIG_ARM_LPAE) && !dev->dma_pfn_offset) + /* + * When CONFIG_ARM_LPAE is set, physical address can extend + * above * 32-bits, which then can't be addressed by devices + * that only support 32-bit DMA. + * Use the generic dma-direct / swiotlb ops code in that case, + * as that handles bounce buffering for us. + */ + dma_ops = NULL; else dma_ops = arm_get_dma_map_ops(coherent); --- - Péter
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel