Thread (6 messages) 6 messages, 2 authors, 2015-08-21

[PATCH v2 1/2] ARM: dma-mapping: Don't use outer_flush_range when the L2C is coherent

From: catalin.marinas@arm.com (Catalin Marinas)
Date: 2015-08-14 17:43:46
Also in: stable

On Fri, Aug 14, 2015 at 06:12:55PM +0200, Gregory CLEMENT wrote:
When a L2 cache controller is used in a system that provides hardware
coherency, the entire outer cache operations are useless, and can be
skipped.  Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
controller and the Cortex-A9.

In the current kernel implementation, the outer cache flush range
operation is triggered by the dma_alloc function.
This operation can be take place during runtime and in some
circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
SoCs.

This patch extends the __dma_clear_buffer() function to receive a
boolean argument related to the coherency of the system. The same
things is done for the calling functions.

Reported-by: Nadav Haklai <redacted>
Signed-off-by: Gregory CLEMENT <redacted>
Cc: <redacted> # v3.16+
Most likely the patch won't apply on kernels before 4.3. But that's
fine, you can send separate patches to stable that work on those kernels
and reference the mainline commit.
quoted hunk ↗ jump to hunk
@@ -358,9 +364,14 @@ static int __init atomic_pool_init(void)
 	if (!atomic_pool)
 		goto out;
 
+	/*
+	 * Here we don't know if the system is coherent, but it is
+	 * harmless to use the non-coherent case in the
+	 * __alloc_from_contiguous() call.
+	 */
 	if (dev_get_cma_area(NULL))
 		ptr = __alloc_from_contiguous(NULL, atomic_pool_size, prot,
-					      &page, atomic_pool_init, true);
+					&page, atomic_pool_init, true, false);
This comment is not entirely correct. The atomic pool is only used for
non-coherent allocations (the prot is pgprot_dmacoherent which means
Normal Non-cacheable memory). So we must pass false for is_coherent.
quoted hunk ↗ jump to hunk
@@ -474,7 +485,8 @@ static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
 {
 	struct page *page;
 	void *ptr = NULL;
-	page = __dma_alloc_buffer(dev, size, gfp);
+	/* This function is only called when the arch is non-coherent */
+	page = __dma_alloc_buffer(dev, size, gfp, false);
Nitpick: "__alloc_remap_buffer() is only called when the arch is
non-coherent". That's to avoid misreading as __dma_alloc_buffer() being
called only for non-coherent cases.

BTW, for other comments as well, instead of "arch" being coherent or not
I would say "device". We check this property on a per-device basis.
quoted hunk ↗ jump to hunk
@@ -601,7 +614,8 @@ static void *__alloc_simple_buffer(struct device *dev, size_t size, gfp_t gfp,
 				   struct page **ret_page)
 {
 	struct page *page;
-	page = __dma_alloc_buffer(dev, size, gfp);
+	/* This function is only called when the arch is coherent */
+	page = __dma_alloc_buffer(dev, size, gfp, true);
Same nitpick as above, "__alloc_simple_buffer() is only called...".

Few minor things above on comments, otherwise:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help