Thread (79 messages), 10 authors, 2021-07-06

Re: [dpdk-dev] [RFC PATCH] dmadev: introduce DMA device library

From: Jerin Jacob <hidden>
Date: 2021-06-18 05:16:38

On Thu, Jun 17, 2021 at 1:30 PM Bruce Richardson
[off-list ref] wrote:
On Thu, Jun 17, 2021 at 01:12:22PM +0530, Jerin Jacob wrote:
quoted
On Thu, Jun 17, 2021 at 12:43 AM Bruce Richardson
[off-list ref] wrote:
quoted
On Wed, Jun 16, 2021 at 11:38:08PM +0530, Jerin Jacob wrote:
quoted
On Wed, Jun 16, 2021 at 11:01 PM Bruce Richardson
[off-list ref] wrote:
quoted
On Wed, Jun 16, 2021 at 05:41:45PM +0800, fengchengwen wrote:
quoted
On 2021/6/16 0:38, Bruce Richardson wrote:
quoted
On Tue, Jun 15, 2021 at 09:22:07PM +0800, Chengwen Feng wrote:
quoted
This patch introduces 'dmadevice' which is a generic type of DMA
device.

The APIs of the dmadev library expose some generic operations which can
enable configuration and I/O with the DMA devices.

Signed-off-by: Chengwen Feng <redacted>
---
Thanks for sending this.

Of most interest to me right now are the key data-plane APIs. While we are
still in the prototyping phase, below is a draft of what we are thinking
for the key enqueue/perform_ops/completed_ops APIs.

Some key differences I note in below vs your original RFC:
* Use of void pointers rather than iova addresses. While using iova's makes
  sense in the general case when using hardware, in that it can work with
  both physical addresses and virtual addresses, if we change the APIs to use
  void pointers instead it will still work for DPDK in VA mode, while at the
  same time allow use of software fallbacks in error cases, and also a stub
  driver that uses memcpy in the background. Finally, using iova's makes the
  APIs a lot more awkward to use with anything but mbufs or similar buffers
  where we already have a pre-computed physical address.
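
As a rough illustration of the stub driver mentioned in the point above, the
following is a minimal sketch of a software-only device that takes plain
pointers and copies with memcpy. The names `stub_dmadev`, `stub_enqueue_copy`
and `stub_completed` are hypothetical and are not taken from the RFC or from
any existing DPDK API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software "device": only tracks job counters. */
struct stub_dmadev {
    uint16_t enqueued;   /* total jobs accepted so far */
    uint16_t completed;  /* total jobs already reported as done */
};

/*
 * Enqueue a copy. With void pointers the stub can call memcpy directly;
 * with IOVAs in PA mode it would have no usable CPU address to copy from.
 * Returns the index of the job (assumed convention for this sketch).
 */
static inline int
stub_enqueue_copy(struct stub_dmadev *dev, const void *src, void *dst,
                  size_t len)
{
    memcpy(dst, src, len);
    return dev->enqueued++;
}

/* Report how many jobs finished since the last call. */
static inline uint16_t
stub_completed(struct stub_dmadev *dev)
{
    uint16_t n = (uint16_t)(dev->enqueued - dev->completed);

    dev->completed = dev->enqueued;
    return n;
}
```

Because the copy happens synchronously in this sketch, every enqueued job is
already complete by the time the application polls for completions.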
The iova is a hint to the application, and widely used in DPDK.
If we switch to void *, how is the address passed (iova or just va)?
This may introduce implementation dependencies here.

Or always pass the va, and the driver performs address translation, and this
translation may cost too much cpu I think.
On the latter point, about driver doing address translation I would agree.
However, we probably need more discussion about the use of iova vs just
virtual addresses. My thinking on this is that if we specify the API using
iovas it will severely hurt usability of the API, since it forces the user
to take more inefficient codepaths in a large number of cases. Given a
pointer to the middle of an mbuf, one cannot just pass that straight as an
iova but must instead do a translation into offset from mbuf pointer and
then re-add the offset to the mbuf base address.
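
The extra translation step described above can be sketched as follows. This
is only an illustration: `struct buf_desc` is a hypothetical stand-in for a
buffer that carries both a virtual base address and a pre-computed IOVA (in
the way an mbuf does), and `ptr_to_iova` is an invented helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor with a known VA and IOVA for its start. */
struct buf_desc {
    void *va_base;       /* virtual address of the buffer start */
    uint64_t iova_base;  /* IO address of the same location */
    size_t len;          /* buffer length in bytes */
};

/*
 * Translate a pointer into the middle of the buffer to an IOVA:
 * first compute the offset from the virtual base, then re-add that
 * offset to the IOVA base.
 */
static inline uint64_t
ptr_to_iova(const struct buf_desc *b, const void *p)
{
    uintptr_t off = (uintptr_t)p - (uintptr_t)b->va_base;

    assert(off < b->len); /* pointer must lie inside the buffer */
    return b->iova_base + off;
}
```

In VA mode the two base addresses are numerically equal, so the pointer could
be handed to the device unchanged and this helper would be unnecessary.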

My preference therefore is to require the use of an IOMMU when using a
dmadev, so that it can be a much closer analog of memcpy. Once an iommu is
present, DPDK will run in VA mode, allowing virtual addresses to our
hugepage memory to be sent directly to hardware. Also, when using
dmadevs on top of an in-kernel driver, that kernel driver may do all iommu
management for the app, removing further the restrictions on what memory
can be addressed by hardware.

One issue with keeping void * is that memory can come from the stack or
heap, which HW cannot really operate on.
When a kernel driver is managing the IOMMU, all process memory can be worked
on, not just hugepage memory, so using iova is wrong in these cases.
But not for stack and heap memory. Right?
Yes, even stack and heap can be accessed.
The HW device cannot, as that memory is NOT mapped in the IOMMU. It will
result in a transaction fault.

At least in octeon, the DMA HW job descriptor has a pointer (IOVA)
which is updated by _HW_ upon copy job completion. That memory cannot
come from the heap (malloc()) or the stack, as those are not mapped by
the IOMMU.

quoted
quoted
As I previously said, using iova prevents the creation of a pure software
dummy driver too using memcpy in the background.
Why? Memory allocated using rte_alloc/rte_memzone etc. can be touched by the CPU.
Yes, but it can't be accessed using a physical address, so again only VA mode,
where iovas are "void *", makes sense.
I agree that it should be a physical address. My only concern is that
void * does not express that the memory cannot be from the stack/heap.
If the API states that the memory needs to be allocated by rte_alloc()
or rte_memzone() etc., that is fine with me.

Or it may be better to have a separate API to allocate the handle so
that, based on the driver, it can be rte_alloc() or malloc(). It can be
a burst API in the slow path to get a number of status pointers.
quoted
Thinking more, since we need a separate function for knowing the
completion status anyway, I think the completion code can be an opaque
object. Exposing the status directly may not help, as the driver needs
a "context" or "call" to change the driver-specific completion code to
a DPDK completion code.
I'm sorry, I didn't follow this. By completion code, you mean the status of
whether a copy job succeeded/failed?
Yes, the status of job completion.
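
One way to picture the opaque completion object discussed above is the
sketch below. Everything in it is an assumption made for illustration: the
`dma_status` values, the `dma_cmpl` layout, the `drv_cmpl_status` name, and
the hardware encoding (0 = not done, 1 = success, anything else = error) are
invented and do not come from the RFC or from any particular device.

```c
#include <stdint.h>

/* Hypothetical generic status codes returned to the application. */
enum dma_status {
    DMA_STATUS_PENDING,
    DMA_STATUS_OK,
    DMA_STATUS_ERROR,
};

/*
 * Opaque completion handle. The application only passes it around;
 * the device writes hw_word when the copy job finishes, so it must
 * live in memory the device can reach.
 */
struct dma_cmpl {
    volatile uint64_t hw_word;
};

/*
 * Driver-provided call that converts the driver-specific completion
 * word into the generic status (assumed encoding, see lead-in).
 */
static inline enum dma_status
drv_cmpl_status(const struct dma_cmpl *c)
{
    uint64_t w = c->hw_word;

    if (w == 0)
        return DMA_STATUS_PENDING;
    return (w == 1) ? DMA_STATUS_OK : DMA_STATUS_ERROR;
}
```

Keeping the handle opaque means each driver can pick its own encoding, at the
cost of one function call whenever the application checks a job.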