Thread (7 messages) 7 messages, 4 authors, 2021-01-13

Re: [PATCH 2/2] dmaengine: ti: k3-udma: Add support for burst_size configuration for mem2mem

From: Péter Ujfalusi <peter.ujfalusi@gmail.com>
Date: 2021-01-13 07:39:22
Also in: lkml

Hi Vinod,

On 1/12/21 12:16 PM, Vinod Koul wrote:
On 14-12-20, 10:13, Peter Ujfalusi wrote:
quoted
The UDMA and BCDMA can provide higher throughput if the burst_size of the
channel is changed from it's default (which is 64 bytes) for Ultra-high
and high capacity channels.

This performance benefit is even more visible when the buffers are aligned
with the burst_size configuration.

The am654 does not have a way to change the burst size, but it is using
64 bytes burst, so increasing the copy_align from 8 bytes to 64 (and
clients taking that into account) can increase the throughput as well.

Numbers gathered on j721e:
echo 8000000 > /sys/module/dmatest/parameters/test_buf_size
echo 2000 > /sys/module/dmatest/parameters/timeout
echo 50 > /sys/module/dmatest/parameters/iterations
echo 1 > /sys/module/dmatest/parameters/max_channels

Prior this patch:       ~1.3 GB/s
After this patch:       ~1.8 GB/s
 with 1 byte alignment: ~1.7 GB/s

Signed-off-by: Peter Ujfalusi <redacted>
---
 drivers/dma/ti/k3-udma.c | 115 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 110 insertions(+), 5 deletions(-)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 87157cbae1b8..54e4ccb1b37e 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -121,6 +121,11 @@ struct udma_oes_offsets {
 #define UDMA_FLAG_PDMA_ACC32		BIT(0)
 #define UDMA_FLAG_PDMA_BURST		BIT(1)
 #define UDMA_FLAG_TDTYPE		BIT(2)
+#define UDMA_FLAG_BURST_SIZE		BIT(3)
+#define UDMA_FLAGS_J7_CLASS		(UDMA_FLAG_PDMA_ACC32 | \
+					 UDMA_FLAG_PDMA_BURST | \
+					 UDMA_FLAG_TDTYPE | \
+					 UDMA_FLAG_BURST_SIZE)
 
 struct udma_match_data {
 	enum k3_dma_type type;
@@ -128,6 +133,7 @@ struct udma_match_data {
 	bool enable_memcpy_support;
 	u32 flags;
 	u32 statictr_z_mask;
+	u8 burst_size[3];
 };
 
 struct udma_soc_data {
@@ -436,6 +442,18 @@ static void k3_configure_chan_coherency(struct dma_chan *chan, u32 asel)
 	}
 }
 
+static u8 udma_get_chan_tpl_index(struct udma_tpl *tpl_map, int chan_id)
+{
+	int i;
+
+	for (i = 0; i < tpl_map->levels; i++) {
+		if (chan_id >= tpl_map->start_idx[i])
+			return i;
+	}
Braces seem not required
True, they are not strictly needed but I prefer to have them when I have
any condition in the loop.
quoted
+
+	return 0;
+}
+
 static void udma_reset_uchan(struct udma_chan *uc)
 {
 	memset(&uc->config, 0, sizeof(uc->config));
@@ -1811,6 +1829,7 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct udma_tchan *tchan = uc->tchan;
 	struct udma_rchan *rchan = uc->rchan;
+	u8 burst_size = 0;
 	int ret = 0;
 
 	/* Non synchronized - mem to mem type of transfer */
@@ -1818,6 +1837,12 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
 	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
 
+	if (ud->match_data->flags & UDMA_FLAG_BURST_SIZE) {
+		u8 tpl = udma_get_chan_tpl_index(&ud->tchan_tpl, tchan->id);
Can we define variable at function start please
The 'tpl' is only used within this if branch, it looks a bit cleaner
imho, but if you insist, I can move the definition.

...
quoted
+static enum dmaengine_alignment udma_get_copy_align(struct udma_dev *ud)
+{
+	const struct udma_match_data *match_data = ud->match_data;
+	u8 tpl;
+
+	if (!match_data->enable_memcpy_support)
+		return DMAENGINE_ALIGN_8_BYTES;
+
+	/* Get the highest TPL level the device supports for memcpy */
+	if (ud->bchan_cnt) {
+		tpl = udma_get_chan_tpl_index(&ud->bchan_tpl, 0);
+	} else if (ud->tchan_cnt) {
+		tpl = udma_get_chan_tpl_index(&ud->tchan_tpl, 0);
+	} else {
+		return DMAENGINE_ALIGN_8_BYTES;
+	}
Braces seem not required
Very true.
quoted
+
+	switch (match_data->burst_size[tpl]) {
+		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_256_BYTES:
+			return DMAENGINE_ALIGN_256_BYTES;
+		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_128_BYTES:
+			return DMAENGINE_ALIGN_128_BYTES;
+		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES:
+		fallthrough;
+		default:
+			return DMAENGINE_ALIGN_64_BYTES;
ah, we are supposed to have case at same indent as switch, pls run
checkpatch to have these flagged off
Yes, they should be.

The other me did a sloppy job for sure, this should have been screaming
even without checkpatch...
This has been done in a rush during the last days to close on the
backlog item which got the most votes.

-- 
Péter
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help