Thread (114 messages) 114 messages, 7 authors, 2020-10-08
STALE2081d REVIEWED: 8 (6M)
Revisions (5)
  1. v1 [diff vs current]
  2. v2 [diff vs current]
  3. v3 [diff vs current]
  4. v4 [diff vs current]
  5. v5 current

[PATCH v5 05/80] drm/vc4: plane: Optimize the LBM allocation size

From: Maxime Ripard <hidden>
Date: 2020-09-03 08:12:37
Also in: dri-devel, lkml
Subsystem: drm drivers, drm drivers and misc gpu patches, drm drivers for vc4, the rest · Maintainers: David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Dave Stevenson, Linus Torvalds

From: Dave Stevenson <dave.stevenson@raspberrypi.com>

The current code is using the maximum of the source line size and the
destination line size to compute the size of the LBM to allocate.

While this is simpler, it starts to be an issue with modes such as 4k with
a quite long that will consume all the available memory, so we no longer
have that luxury.

Tested-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Hoegeun Kwon <redacted>
Tested-by: Stefan Wahren <redacted>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Maxime Ripard <redacted>
---
 drivers/gpu/drm/vc4/vc4_plane.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index d0771ebd5f75..23916814b4e3 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -437,10 +437,7 @@ static void vc4_write_ppf(struct vc4_plane_state *vc4_state, u32 src, u32 dst)
 static u32 vc4_lbm_size(struct drm_plane_state *state)
 {
 	struct vc4_plane_state *vc4_state = to_vc4_plane_state(state);
-	/* This is the worst case number.  One of the two sizes will
-	 * be used depending on the scaling configuration.
-	 */
-	u32 pix_per_line = max(vc4_state->src_w[0], (u32)vc4_state->crtc_w);
+	u32 pix_per_line;
 	u32 lbm;
 
 	/* LBM is not needed when there's no vertical scaling. */
@@ -448,6 +445,18 @@ static u32 vc4_lbm_size(struct drm_plane_state *state)
 	    vc4_state->y_scaling[1] == VC4_SCALING_NONE)
 		return 0;
 
+	/*
+	 * This can be further optimized in the RGB/YUV444 case if the PPF
+	 * decimation factor is between 0.5 and 1.0 by using crtc_w.
+	 *
+	 * It's not an issue though, since in that case since src_w[0] is going
+	 * to be greater than or equal to crtc_w.
+	 */
+	if (vc4_state->x_scaling[0] == VC4_SCALING_TPZ)
+		pix_per_line = vc4_state->crtc_w;
+	else
+		pix_per_line = vc4_state->src_w[0];
+
 	if (!vc4_state->is_yuv) {
 		if (vc4_state->y_scaling[0] == VC4_SCALING_TPZ)
 			lbm = pix_per_line * 8;
-- 
git-series 0.9.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help