[PATCH v3 2/4] dmaengine: Add STM32 MDMA driver

From: Vinod Koul <hidden>
Date: 2017-08-23 15:57:28
Also in: linux-devicetree, lkml

On Tue, Aug 22, 2017 at 05:59:26PM +0200, Pierre Yves MORDRET wrote:


On 08/16/2017 06:47 PM, Vinod Koul wrote:

quoted

On Wed, Jul 26, 2017 at 11:48:20AM +0200, Pierre-Yves MORDRET wrote:

quoted

+/* MDMA Channel x transfer configuration register */
+#define STM32_MDMA_CTCR(x)		(0x50 + 0x40 * (x))
+#define STM32_MDMA_CTCR_BWM		BIT(31)
+#define STM32_MDMA_CTCR_SWRM		BIT(30)
+#define STM32_MDMA_CTCR_TRGM_MSK	GENMASK(29, 28)
+#define STM32_MDMA_CTCR_TRGM(n)		(((n) & 0x3) << 28)
+#define STM32_MDMA_CTCR_TRGM_GET(n)	(((n) & STM32_MDMA_CTCR_TRGM_MSK) >> 28)

OK this seems oft repeated here.

So you are trying to extract the bit values and set the bit value, so why
not this do generically...

#define STM32_MDMA_SHIFT(n)		(ffs(n) - 1))
#define STM32_MDMA_SET(n, mask)		((n) << STM32_MDMA_SHIFT(mask))
#define STM32_MDMA_GET(n, mask)		(((n) && mask) >> STM32_MDMA_SHIFT(mask))

Basically, u extract the shift using the mask value and ffs helping out, so
no need to define these and reduce chances of coding errors...

OK.
but I would prefer if you don't mind

hmmm, I don't have a very strong opinion, so okay. But from programming PoV
it reduces human errors..

#define STM32_MDMA_SET(n, mask)		(((n) << STM32_MDMA_SHIFT(mask)) & mask)

quoted

+static int stm32_mdma_get_width(struct stm32_mdma_chan *chan,
+				enum dma_slave_buswidth width)
+{
+	switch (width) {
+	case DMA_SLAVE_BUSWIDTH_1_BYTE:
+	case DMA_SLAVE_BUSWIDTH_2_BYTES:
+	case DMA_SLAVE_BUSWIDTH_4_BYTES:
+	case DMA_SLAVE_BUSWIDTH_8_BYTES:
+		return ffs(width) - 1;
+	default:
+		dev_err(chan2dev(chan), "Dma bus width not supported\n");

please log the width here, helps in debug...

Hum.. just a dev_dbg to log the actual width or within the dev_err ?

latter pls

quoted

+static u32 stm32_mdma_get_best_burst(u32 buf_len, u32 tlen, u32 max_burst,
+				     enum dma_slave_buswidth width)
+{
+	u32 best_burst = max_burst;
+	u32 burst_len = best_burst * width;
+
+	while ((burst_len > 0) && (tlen % burst_len)) {
+		best_burst = best_burst >> 1;
+		burst_len = best_burst * width;
+	}
+
+	return (best_burst > 0) ? best_burst : 1;

when would best_burst <= 0? DO we really need this check

best_burst < 0 is obviously unlikely but =0 is likely whether no best burst
found. Se we do need this check.

quoted

+static struct dma_async_tx_descriptor *
+stm32_mdma_prep_dma_cyclic(struct dma_chan *c, dma_addr_t buf_addr,
+			   size_t buf_len, size_t period_len,
+			   enum dma_transfer_direction direction,
+			   unsigned long flags)
+{
+	struct stm32_mdma_chan *chan = to_stm32_mdma_chan(c);
+	struct stm32_mdma_device *dmadev = stm32_mdma_get_dev(chan);
+	struct dma_slave_config *dma_config = &chan->dma_config;
+	struct stm32_mdma_desc *desc;
+	dma_addr_t src_addr, dst_addr;
+	u32 ccr, ctcr, ctbr, count;
+	int i, ret;
+
+	if (!buf_len || !period_len || period_len > STM32_MDMA_MAX_BLOCK_LEN) {
+		dev_err(chan2dev(chan), "Invalid buffer/period len\n");
+		return NULL;
+	}
+
+	if (buf_len % period_len) {
+		dev_err(chan2dev(chan), "buf_len not multiple of period_len\n");
+		return NULL;
+	}
+
+	/*
+	 * We allow to take more number of requests till DMA is
+	 * not started. The driver will loop over all requests.
+	 * Once DMA is started then new requests can be queued only after
+	 * terminating the DMA.
+	 */
+	if (chan->busy) {
+		dev_err(chan2dev(chan), "Request not allowed when dma busy\n");
+		return NULL;
+	}

is that a HW restriction? Once a txn is completed can't we submit
subsequent txn..? Can you explain this part please.

Driver can prepare any request Slave SG, Memcpy or Cyclic. But if the channel is
busy to complete a DMA transfer, the request will be put in pending list. This
is only when the DMA transfer is going to be completed the next descriptor is
going to be processed and started.
However for cyclic this is different since when cyclic is ignited the channel
will be busy until its termination. This is why we forbid any DMA preparation
for this channel.
Nonetheless I believe we have a flaw here since we have to forbid
Slave/Memcpy/Cyclic whether a cyclic request is on-going.

But you are not submitting a txn to HW. The prepare_xxx operation prepares a
descriptor which is pushed to pending queue on submit and further pushed to
hw on queue move or issue_pending()

So here we should ideally accept the request.

After you finish memcpy you can submit a memcpy right...?

quoted

+	if (len <= STM32_MDMA_MAX_BLOCK_LEN) {
+		cbndtr |= STM32_MDMA_CBNDTR_BNDT(len);
+		if (len <= STM32_MDMA_MAX_BUF_LEN) {
+			/* Setup a buffer transfer */
+			tlen = len;
+			ccr |= STM32_MDMA_CCR_TCIE | STM32_MDMA_CCR_CTCIE;
+			ctcr |= STM32_MDMA_CTCR_TRGM(STM32_MDMA_BUFFER);
+			ctcr |= STM32_MDMA_CTCR_TLEN((tlen - 1));
+		} else {
+			/* Setup a block transfer */
+			tlen = STM32_MDMA_MAX_BUF_LEN;
+			ccr |= STM32_MDMA_CCR_BTIE | STM32_MDMA_CCR_CTCIE;
+			ctcr |= STM32_MDMA_CTCR_TRGM(STM32_MDMA_BLOCK);
+			ctcr |= STM32_MDMA_CTCR_TLEN(tlen - 1);
+		}
+
+		/* Set best burst size */
+		max_width = DMA_SLAVE_BUSWIDTH_1_BYTE;

that maynot be best.. we should have wider and longer burst for best
throughput..

Will look at that.

quoted

+	ret = device_property_read_u32(&pdev->dev, "dma-requests",
+				       &nr_requests);
+	if (ret) {
+		nr_requests = STM32_MDMA_MAX_REQUESTS;
+		dev_warn(&pdev->dev, "MDMA defaulting on %i request lines\n",
+			 nr_requests);
+	}
+
+	count = of_property_count_u32_elems(of_node, "st,ahb-addr-masks");

We dont have device_property_xxx for this?

Sorry no. Well didn't figure out one though.

we do :) the array property with NULL argument returns the size of array..

int device_property_read_u32_array(struct device *dev, const char *propname,
				   u32 *val, size_t nval)

Documentation says:
 Return: number of values if @val was %NULL,

quoted

+	if (count < 0)
+		count = 0;
+
+	dmadev = devm_kzalloc(&pdev->dev, sizeof(*dmadev) + sizeof(u32) * count,
+			      GFP_KERNEL);
+	if (!dmadev)
+		return -ENOMEM;
+
+	dmadev->nr_channels = nr_channels;
+	dmadev->nr_requests = nr_requests;
+	of_property_read_u32_array(of_node, "st,ahb-addr-masks",
+				   dmadev->ahb_addr_masks,
+				   count);

i know we have an device api for array reads :)
and I think that helps in former case..

Correct :) device_property_read_u32_array

yes..

-- 
~Vinod

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help