Thread (79 messages), 10 authors, 2021-07-06

Re: [dpdk-dev] dmadev discussion summary

From: fengchengwen <hidden>
Date: 2021-07-06 07:11:25


On 2021/7/5 18:28, Morten Brørup wrote:
quoted
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
Sent: Sunday, 4 July 2021 09.43

On Sat, Jul 3, 2021 at 5:54 PM Morten Brørup [off-list ref]
wrote:
quoted
quoted
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
Sent: Saturday, 3 July 2021 11.09

On Sat, Jul 3, 2021 at 2:23 PM Morten Brørup
[off-list ref]
quoted
quoted
wrote:
quoted
quoted
From: fengchengwen [mailto:fengchengwen@huawei.com]
Sent: Saturday, 3 July 2021 02.32

On 2021/7/2 22:57, Morten Brørup wrote:
quoted
quoted
In the DPDK framework, many data-plane API names contain queues,
e.g. eventdev/crypto.
The concept of virt queues has continuity.
I was also wondering about the name "virtual queue".

Usually, something "virtual" would be an abstraction of something
physical, e.g. a software layer on top of something physical.
quoted
Back in the day, a "DMA channel" used to mean a DMA engine on a CPU.
If a CPU had 2 DMA channels, they could both be set up simultaneously.
quoted
quoted
quoted
The current design has the "dmadev" representing a CPU or other chip,
which has one or more "HW-queues" representing DMA channels (of the
same type), and then "virt-queue" as a software abstraction on top, for
using a DMA channel in different ways through individually configured
contexts (virt-queues).
quoted
It makes sense to me, although I would consider renaming "HW-queue"
to "channel" and perhaps "virt-queue" to "queue".

The term 'DMA channel' is used more often than 'DMA queue'; Google
shows at least 20 times more results for it.

It's a good idea to build the abstraction layers:
queue <> channel <> dma-controller.
In this way, the meaning of each layer is relatively easy to
distinguish literally.

will fix in V2
After re-reading all the mails in this thread, I have found one more
important high level detail still not decided:
quoted
Bruce had suggested flattening the DMA channels, so each dmadev
represents a DMA channel. And DMA controllers with multiple DMA
channels will have to instantiate multiple dmadevs, one for each DMA
channel.
Just like a four port NIC instantiates four ethdevs.

Then, like ethdevs, there would only be two abstraction layers:
dmadev <> queue, where a dmadev is a DMA channel on a DMA
controller.
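
The flattened model described above can be sketched roughly as follows. This is only an illustration of "one dmadev per DMA channel"; the types `dma_controller` and `dmadev`, the table, and `dma_controller_probe()` are hypothetical names invented for the sketch, not the DPDK dmadev API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DMADEVS 16

/* Hypothetical types for illustration only; not the DPDK dmadev API. */
struct dma_controller {
    const char *name;
    uint16_t nb_channels;   /* DMA channels provided by this controller */
};

struct dmadev {
    const struct dma_controller *ctrl; /* parent controller */
    uint16_t channel_id;               /* the one channel this dmadev represents */
};

static struct dmadev dmadev_table[MAX_DMADEVS];
static uint16_t nb_dmadevs;

/*
 * Flattened probe: instantiate one dmadev per DMA channel, the same way a
 * four-port NIC instantiates four ethdevs. Returns the number of dmadevs
 * created, or -1 if the table has no room for all channels.
 */
static int
dma_controller_probe(const struct dma_controller *ctrl)
{
    uint16_t ch;

    if (ctrl->nb_channels > MAX_DMADEVS - nb_dmadevs)
        return -1;

    for (ch = 0; ch < ctrl->nb_channels; ch++) {
        dmadev_table[nb_dmadevs].ctrl = ctrl;
        dmadev_table[nb_dmadevs].channel_id = ch;
        nb_dmadevs++;
    }
    return ctrl->nb_channels;
}
```

With this layout only two layers remain (dmadev <> queue). Whether it works without locking depends on the channels not sharing registers.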
quoted
quoted
quoted
However, this assumes that the fast path functions on the individual
DMA channels of a DMA controller can be accessed completely
independently and simultaneously by multiple threads. (Otherwise, the
driver would need to implement critical regions or locking around
accessing the common registers in the DMA controller shared by the DMA
channels.)
quoted
Unless any of the DMA controller vendors claim that this assumption
about independence of the DMA channels is wrong, I strongly support
Bruce's flattening suggestion.

It is wrong, at least from the octeontx2_dma PoV.

# The PCI device is the DMA controller where the driver/device is
  mapped. (As the device driver is based on the PCI bus, we don't want
  to have a vdev for this.)
# The PCI device has HW queue(s).
# Each HW queue has different channels.

In the current configuration, we have only one queue per device and it
has 4 channels. The 4 channels are not thread-safe, as they are based
on a single queue.
Please clarify "current configuration": Is that a configuration
modifiable by changing some software/driver, or is it the chip that was
built that way in the RTL code?

We have 8 queues per SoC. On some HW versions, it can be configured as
(a) or (b) using FW settings:
a) One PCI device with 8 queues
b) 8 PCI devices, each with one queue

Everyone is using mode (b), as it lets 8 different applications use
DMA: if one application binds a PCI device, other applications cannot
use the same PCI device.
If one application needs 8 queues, all 8 DMA devices can be bound to
that single application in mode (b).
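
As a rough illustration of the two firmware layouts, the trade-off can be written out as below. The enum, struct, and function names are hypothetical and exist only for this sketch; the only facts taken from the mail are the 8 queues per SoC and the two modes (a) and (b).

```c
#include <stdint.h>

#define SOC_DMA_QUEUES 8  /* from the thread: 8 queues per SoC */

/* Hypothetical names for the two firmware-selected layouts (a) and (b). */
enum fw_mode {
    FW_MODE_A, /* one PCI device exposing all 8 queues */
    FW_MODE_B, /* 8 PCI devices, one queue each */
};

struct dma_layout {
    uint16_t nb_pci_devs;
    uint16_t queues_per_dev;
};

static struct dma_layout
dma_layout_for_mode(enum fw_mode mode)
{
    struct dma_layout l;

    if (mode == FW_MODE_A) {
        l.nb_pci_devs = 1;
        l.queues_per_dev = SOC_DMA_QUEUES;
    } else {
        l.nb_pci_devs = SOC_DMA_QUEUES;
        l.queues_per_dev = 1;
    }
    return l;
}

/*
 * A PCI device can be bound by only one application, so the number of
 * applications that can use DMA at once equals the number of PCI devices.
 */
static uint16_t
max_concurrent_apps(enum fw_mode mode)
{
    return dma_layout_for_mode(mode).nb_pci_devs;
}
```

Both modes expose the same total number of queues; mode (b) only changes how many independent owners those queues can have.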


I think, in the above way, we can flatten to <device> <> <channel/queue>.
I just looked at the dpaa2_qdma driver code, and it seems OK to set up
one xxxdev for one queue.
quoted
quoted
quoted
I think, if we need to flatten it, it makes sense to have
dmadev <> channel (and each channel can have a thread-safe capability,
based on how it is mapped onto HW queues and on the device driver
capability).
The key question is how many threads can independently call data-plane
dmadev functions (rte_dma_copy() etc.) simultaneously. If I understand
your explanation correctly, only one - because you only have one DMA
device, and all access to it goes through a single hardware queue.
quoted
I just realized that although you only have one DMA Controller with
only one HW queue, your four DMA channels allow four sequentially
initiated transactions to be running simultaneously. Does the
application have any benefit by knowing that the dmadev can have
multiple ongoing transactions, or can the fast-path dmadev API hide
that ability?

In my view it is better to hide it, and I have a similar proposal at
http://mails.dpdk.org/archives/dev/2021-July/213141.html
--------------
quoted
  7) Because data-plane APIs are not thread-safe, and the user can determine
     the virt-queue to HW-queue mapping (at the queue-setup stage), it is the
     user's duty to ensure thread safety.
+1. But I am not sure how easy it is for the fast-path application to
have this logic.
Instead, I think it is better for the driver to report the capability
per queue, and for the application to state its requirement in the
channel configuration (whether multiple producers enqueue to the same
HW queue or not).
Based on the request, the implementation can pick the correct function
pointer for enq (lock vs lockless version, if HW does not support
lockless).
+1 to that!
Adding it to the channel configuration sounds good.
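
A minimal sketch of that idea, under stated assumptions: the driver reports whether the HW accepts concurrent submissions, the application says whether it has multiple producers, and the enqueue function pointer is chosen once at configuration time. All names here (`dma_chan`, `enq_locked`, `enq_lockless`, `dma_chan_configure`) are hypothetical and not part of any DPDK API, and the in-memory array merely stands in for a hardware queue.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64

/* Hypothetical types for illustration only; not the DPDK dmadev API. */
struct dma_chan;
typedef int (*enq_fn)(struct dma_chan *ch, uint64_t desc);

struct dma_chan {
    bool hw_mp_safe;          /* capability reported by the driver */
    atomic_flag lock;         /* used only by the locked enqueue path */
    uint64_t ring[RING_SIZE]; /* stand-in for the hardware queue */
    uint32_t tail;
    enq_fn enq;               /* picked once, at configuration time */
};

/* Plain path: no software synchronization around the submission. */
static int
enq_lockless(struct dma_chan *ch, uint64_t desc)
{
    if (ch->tail == RING_SIZE)
        return -1; /* queue full */
    ch->ring[ch->tail++] = desc;
    return 0;
}

/* Multi-producer path for HW that cannot take concurrent submissions. */
static int
enq_locked(struct dma_chan *ch, uint64_t desc)
{
    int ret;

    while (atomic_flag_test_and_set_explicit(&ch->lock, memory_order_acquire))
        ; /* spin until the lock is free */
    ret = enq_lockless(ch, desc);
    atomic_flag_clear_explicit(&ch->lock, memory_order_release);
    return ret;
}

/* want_mp: the application will enqueue from multiple producers. */
static void
dma_chan_configure(struct dma_chan *ch, bool hw_mp_safe, bool want_mp)
{
    ch->hw_mp_safe = hw_mp_safe;
    atomic_flag_clear(&ch->lock);
    ch->tail = 0;
    ch->enq = (want_mp && !hw_mp_safe) ? enq_locked : enq_lockless;
}
```

The fast path then always calls `ch->enq(ch, desc)`, so single-producer applications never pay for the lock.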
quoted
------------------------