Thread (3 messages) 3 messages, 2 authors, 2021-10-04

Re: Re: [PATCH] media: cedrus: Don't kernel map most buffers

From: Jernej Škrabec <jernej.skrabec@gmail.com>
Date: 2021-10-04 16:06:33
Also in: linux-media, linux-staging, linux-sunxi, lkml

Hi!

Dne ponedeljek, 04. oktober 2021 ob 10:10:55 CEST je Hans Verkuil napisal(a):
Hi Jernej,

Some comments below:

On 12/09/2021 08:08, Jernej Skrabec wrote:
quoted
Except VP8 probability coefficients buffer, all other buffers are never
Except -> Except for
Ok.
quoted
accessed by CPU. That allows us to mark them with 
DMA_ATTR_NO_KERNEL_MAPPING
quoted
flag. This helps with decoding big (like 4k) videos on 32-bit ARM
platforms where default vmalloc size is relatively small - 240 MiB.
Since auxiliary buffer are not yet efficiently allocated, this can be
easily exceeded. Even if allocation is optimized, 4k videos will still
often exceed this limit.

Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
---
 .../staging/media/sunxi/cedrus/cedrus_h264.c  | 102 ++++++++++--------
 .../staging/media/sunxi/cedrus/cedrus_h265.c  |  28 ++---
 .../staging/media/sunxi/cedrus/cedrus_video.c |   2 +
 3 files changed, 73 insertions(+), 59 deletions(-)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c b/drivers/
staging/media/sunxi/cedrus/cedrus_h264.c
quoted
index de7442d4834d..6e38b37d9fe1 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
@@ -538,23 +538,23 @@ static int cedrus_h264_start(struct cedrus_ctx *ctx)
 
 	ctx->codec.h264.pic_info_buf_size = pic_info_size;
 	ctx->codec.h264.pic_info_buf =
-		dma_alloc_coherent(dev->dev, ctx-
codec.h264.pic_info_buf_size,
quoted
-				   &ctx-
codec.h264.pic_info_buf_dma,
quoted
-				   GFP_KERNEL);
+		dma_alloc_attrs(dev->dev, ctx-
codec.h264.pic_info_buf_size,
quoted
+				&ctx-
codec.h264.pic_info_buf_dma,
quoted
+				GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 	if (!ctx->codec.h264.pic_info_buf)
 		return -ENOMEM;
 
 	/*
 	 * That buffer is supposed to be 16kiB in size, and be aligned
-	 * on 16kiB as well. However, dma_alloc_coherent provides the
+	 * on 16kiB as well. However, dma_alloc_attrs provides the
 	 * guarantee that we'll have a CPU and DMA address aligned on
Does the 'CPU' part of this sentence still make sense since the CPU
won't access the buffer?
Good catch! I'll remove it.
quoted
 	 * the smallest page order that is greater to the requested
 	 * size, so we don't have to overallocate.
 	 */
 	ctx->codec.h264.neighbor_info_buf =
-		dma_alloc_coherent(dev->dev, 
CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
quoted
-				   &ctx-
codec.h264.neighbor_info_buf_dma,
quoted
-				   GFP_KERNEL);
+		dma_alloc_attrs(dev->dev, 
CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
quoted
+				&ctx-
codec.h264.neighbor_info_buf_dma,
quoted
+				GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
I think it would be good to have a comment for all these dma_alloc_attrs
calls where you note that these buffers are used by the HW only, and
never by the CPU, hence the use of DMA_ATTR_NO_KERNEL_MAPPING.
Ok.
quoted
 	if (!ctx->codec.h264.neighbor_info_buf) {
 		ret = -ENOMEM;
 		goto err_pic_buf;
@@ -582,10 +582,11 @@ static int cedrus_h264_start(struct cedrus_ctx *ctx)
 
 	mv_col_size = field_size * 2 * CEDRUS_H264_FRAME_NUM;
 	ctx->codec.h264.mv_col_buf_size = mv_col_size;
-	ctx->codec.h264.mv_col_buf = dma_alloc_coherent(dev->dev,
-							
ctx->codec.h264.mv_col_buf_size,
quoted
-							
&ctx->codec.h264.mv_col_buf_dma,
quoted
-							
GFP_KERNEL);
quoted
+	ctx->codec.h264.mv_col_buf =
+		dma_alloc_attrs(dev->dev,
+				ctx->codec.h264.mv_col_buf_size,
+				&ctx->codec.h264.mv_col_buf_dma,
+				GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 	if (!ctx->codec.h264.mv_col_buf) {
 		ret = -ENOMEM;
 		goto err_neighbor_buf;
@@ -600,10 +601,10 @@ static int cedrus_h264_start(struct cedrus_ctx *ctx)
 		ctx->codec.h264.deblk_buf_size =
 			ALIGN(ctx->src_fmt.width, 32) * 12;
 		ctx->codec.h264.deblk_buf =
-			dma_alloc_coherent(dev->dev,
-					   ctx-
codec.h264.deblk_buf_size,
quoted
-					   &ctx-
codec.h264.deblk_buf_dma,
quoted
-					   GFP_KERNEL);
+			dma_alloc_attrs(dev->dev,
+					ctx-
codec.h264.deblk_buf_size,
quoted
+					&ctx-
codec.h264.deblk_buf_dma,
quoted
+					GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 		if (!ctx->codec.h264.deblk_buf) {
 			ret = -ENOMEM;
 			goto err_mv_col_buf;
@@ -616,10 +617,10 @@ static int cedrus_h264_start(struct cedrus_ctx *ctx)
 		ctx->codec.h264.intra_pred_buf_size =
 			ALIGN(ctx->src_fmt.width, 64) * 5 * 2;
 		ctx->codec.h264.intra_pred_buf =
-			dma_alloc_coherent(dev->dev,
-					   ctx-
codec.h264.intra_pred_buf_size,
quoted
-					   &ctx-
codec.h264.intra_pred_buf_dma,
quoted
-					   GFP_KERNEL);
+			dma_alloc_attrs(dev->dev,
+					ctx-
codec.h264.intra_pred_buf_size,
quoted
+					&ctx-
codec.h264.intra_pred_buf_dma,
quoted
+					GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 		if (!ctx->codec.h264.intra_pred_buf) {
 			ret = -ENOMEM;
 			goto err_deblk_buf;
@@ -629,24 +630,28 @@ static int cedrus_h264_start(struct cedrus_ctx *ctx)
 	return 0;
 
 err_deblk_buf:
-	dma_free_coherent(dev->dev, ctx->codec.h264.deblk_buf_size,
-			  ctx->codec.h264.deblk_buf,
-			  ctx->codec.h264.deblk_buf_dma);
+	dma_free_attrs(dev->dev, ctx->codec.h264.deblk_buf_size,
+		       ctx->codec.h264.deblk_buf,
+		       ctx->codec.h264.deblk_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 
 err_mv_col_buf:
-	dma_free_coherent(dev->dev, ctx->codec.h264.mv_col_buf_size,
-			  ctx->codec.h264.mv_col_buf,
-			  ctx->codec.h264.mv_col_buf_dma);
+	dma_free_attrs(dev->dev, ctx->codec.h264.mv_col_buf_size,
+		       ctx->codec.h264.mv_col_buf,
+		       ctx->codec.h264.mv_col_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 
 err_neighbor_buf:
-	dma_free_coherent(dev->dev, CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
-			  ctx->codec.h264.neighbor_info_buf,
-			  ctx->codec.h264.neighbor_info_buf_dma);
+	dma_free_attrs(dev->dev, CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
+		       ctx->codec.h264.neighbor_info_buf,
+		       ctx->codec.h264.neighbor_info_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 
 err_pic_buf:
-	dma_free_coherent(dev->dev, ctx->codec.h264.pic_info_buf_size,
-			  ctx->codec.h264.pic_info_buf,
-			  ctx->codec.h264.pic_info_buf_dma);
+	dma_free_attrs(dev->dev, ctx->codec.h264.pic_info_buf_size,
+		       ctx->codec.h264.pic_info_buf,
+		       ctx->codec.h264.pic_info_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 	return ret;
 }
 
@@ -654,23 +659,28 @@ static void cedrus_h264_stop(struct cedrus_ctx *ctx)
 {
 	struct cedrus_dev *dev = ctx->dev;
 
-	dma_free_coherent(dev->dev, ctx->codec.h264.mv_col_buf_size,
-			  ctx->codec.h264.mv_col_buf,
-			  ctx->codec.h264.mv_col_buf_dma);
-	dma_free_coherent(dev->dev, CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
-			  ctx->codec.h264.neighbor_info_buf,
-			  ctx->codec.h264.neighbor_info_buf_dma);
-	dma_free_coherent(dev->dev, ctx->codec.h264.pic_info_buf_size,
-			  ctx->codec.h264.pic_info_buf,
-			  ctx->codec.h264.pic_info_buf_dma);
+	dma_free_attrs(dev->dev, ctx->codec.h264.mv_col_buf_size,
+		       ctx->codec.h264.mv_col_buf,
+		       ctx->codec.h264.mv_col_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
+	dma_free_attrs(dev->dev, CEDRUS_NEIGHBOR_INFO_BUF_SIZE,
+		       ctx->codec.h264.neighbor_info_buf,
+		       ctx->codec.h264.neighbor_info_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
+	dma_free_attrs(dev->dev, ctx->codec.h264.pic_info_buf_size,
+		       ctx->codec.h264.pic_info_buf,
+		       ctx->codec.h264.pic_info_buf_dma,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 	if (ctx->codec.h264.deblk_buf_size)
-		dma_free_coherent(dev->dev, ctx-
codec.h264.deblk_buf_size,
quoted
-				  ctx->codec.h264.deblk_buf,
-				  ctx->codec.h264.deblk_buf_dma);
+		dma_free_attrs(dev->dev, ctx-
codec.h264.deblk_buf_size,
quoted
+			       ctx->codec.h264.deblk_buf,
+			       ctx->codec.h264.deblk_buf_dma,
+			       DMA_ATTR_NO_KERNEL_MAPPING);
 	if (ctx->codec.h264.intra_pred_buf_size)
-		dma_free_coherent(dev->dev, ctx-
codec.h264.intra_pred_buf_size,
quoted
-				  ctx->codec.h264.intra_pred_buf,
-				  ctx-
codec.h264.intra_pred_buf_dma);
quoted
+		dma_free_attrs(dev->dev, ctx-
codec.h264.intra_pred_buf_size,
quoted
+			       ctx->codec.h264.intra_pred_buf,
+			       ctx->codec.h264.intra_pred_buf_dma,
+			       DMA_ATTR_NO_KERNEL_MAPPING);
 }
 
 static void cedrus_h264_trigger(struct cedrus_ctx *ctx)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/
staging/media/sunxi/cedrus/cedrus_h265.c
quoted
index 3d9561d4aadb..bb7eb56106c5 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -351,10 +351,10 @@ static void cedrus_h265_setup(struct cedrus_ctx 
*ctx,
quoted
 			ctx->codec.h265.mv_col_buf_unit_size;
 
 		ctx->codec.h265.mv_col_buf =
-			dma_alloc_coherent(dev->dev,
-					   ctx-
codec.h265.mv_col_buf_size,
quoted
-					   &ctx-
codec.h265.mv_col_buf_addr,
quoted
-					   GFP_KERNEL);
+			dma_alloc_attrs(dev->dev,
+					ctx-
codec.h265.mv_col_buf_size,
quoted
+					&ctx-
codec.h265.mv_col_buf_addr,
quoted
+					GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 		if (!ctx->codec.h265.mv_col_buf) {
 			ctx->codec.h265.mv_col_buf_size = 0;
 			// TODO: Abort the process here.
@@ -668,9 +668,9 @@ static int cedrus_h265_start(struct cedrus_ctx *ctx)
 	ctx->codec.h265.mv_col_buf_size = 0;
 
 	ctx->codec.h265.neighbor_info_buf =
-		dma_alloc_coherent(dev->dev, 
CEDRUS_H265_NEIGHBOR_INFO_BUF_SIZE,
quoted
-				   &ctx-
codec.h265.neighbor_info_buf_addr,
quoted
-				   GFP_KERNEL);
+		dma_alloc_attrs(dev->dev, 
CEDRUS_H265_NEIGHBOR_INFO_BUF_SIZE,
quoted
+				&ctx-
codec.h265.neighbor_info_buf_addr,
quoted
+				GFP_KERNEL, 
DMA_ATTR_NO_KERNEL_MAPPING);
quoted
 	if (!ctx->codec.h265.neighbor_info_buf)
 		return -ENOMEM;
 
@@ -682,16 +682,18 @@ static void cedrus_h265_stop(struct cedrus_ctx *ctx)
 	struct cedrus_dev *dev = ctx->dev;
 
 	if (ctx->codec.h265.mv_col_buf_size > 0) {
-		dma_free_coherent(dev->dev, ctx-
codec.h265.mv_col_buf_size,
quoted
-				  ctx->codec.h265.mv_col_buf,
-				  ctx-
codec.h265.mv_col_buf_addr);
quoted
+		dma_free_attrs(dev->dev, ctx-
codec.h265.mv_col_buf_size,
quoted
+			       ctx->codec.h265.mv_col_buf,
+			       ctx->codec.h265.mv_col_buf_addr,
+			       DMA_ATTR_NO_KERNEL_MAPPING);
 
 		ctx->codec.h265.mv_col_buf_size = 0;
 	}
 
-	dma_free_coherent(dev->dev, CEDRUS_H265_NEIGHBOR_INFO_BUF_SIZE,
-			  ctx->codec.h265.neighbor_info_buf,
-			  ctx->codec.h265.neighbor_info_buf_addr);
+	dma_free_attrs(dev->dev, CEDRUS_H265_NEIGHBOR_INFO_BUF_SIZE,
+		       ctx->codec.h265.neighbor_info_buf,
+		       ctx->codec.h265.neighbor_info_buf_addr,
+		       DMA_ATTR_NO_KERNEL_MAPPING);
 }
 
 static void cedrus_h265_trigger(struct cedrus_ctx *ctx)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c b/drivers/
staging/media/sunxi/cedrus/cedrus_video.c
quoted
index 66714609b577..800ffa5382de 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
@@ -568,6 +568,7 @@ int cedrus_queue_init(void *priv, struct vb2_queue 
*src_vq,
quoted
 
 	src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
 	src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+	src_vq->dma_attrs = DMA_ATTR_NO_KERNEL_MAPPING;
 	src_vq->drv_priv = ctx;
 	src_vq->buf_struct_size = sizeof(struct cedrus_buffer);
 	src_vq->ops = &cedrus_qops;
@@ -584,6 +585,7 @@ int cedrus_queue_init(void *priv, struct vb2_queue 
*src_vq,
quoted
 
 	dst_vq->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
 	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+	src_vq->dma_attrs = DMA_ATTR_NO_KERNEL_MAPPING;
This should be dst_vq!

I'm also not quite sure that it is right to use this for the destination
buffer. Could this cause problems if this buffer is exported to a dmabuf
and another driver will be using it and requiring a kernel mapping.
Hm, if I fix this, it actually breaks DRM mapping. For some reason it expects 
vmap to be valid. I'll drop this for now.

Thanks for review!

Best regards,
Jernej
Regards,

	Hans
quoted
 	dst_vq->drv_priv = ctx;
 	dst_vq->buf_struct_size = sizeof(struct cedrus_buffer);
 	dst_vq->ops = &cedrus_qops;


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help