Re: [dpdk-dev] dmadev discussion summary
From: Bruce Richardson <hidden>
Date: 2021-06-28 10:01:06
On Sat, Jun 26, 2021 at 11:59:49AM +0800, fengchengwen wrote:
Hi, all I analyzed the current DPAM DMA driver and drew this summary in conjunction with the previous discussion, and this will as a basis for the V2 implementation. Feedback is welcome, thanks
Fantastic review and summary, many thanks for the work. Some comments inline in API part below, but nothing too major, I hope. /Bruce <snip>
Summary:
1) The dpaa2/octeontx2/Kunpeng are all ARM soc, there may acts as endpoint of
x86 host (e.g. smart NIC), multiple memory transfer requirements may exist,
e.g. local-to-host/local-to-host..., from the point of view of API design,
I think we should adopt a similar 'channel' or 'virt-queue' concept.
2) Whether to create a separate dmadev for each HW-queue? We previously
discussed this, and due HW-queue could indepent management (like
Kunpeng_dma and Intel DSA), we prefer create a separate dmadev for each
HW-queue before. But I'm not sure if that's the case with dpaa. I think
that can be left to the specific driver, no restriction is imposed on the
framework API layer.
3) I think we could setup following abstraction at dmadev device:
------------ ------------
|virt-queue| |virt-queue|
------------ ------------
\ /
\ /
\ /
------------ ------------
| HW-queue | | HW-queue |
------------ ------------
\ /
\ /
\ /
dmadev
4) The driver's ops design (here we only list key points):
[dev_info_get]: mainly return the number of HW-queues
[dev_configure]: nothing important
[queue_setup]: create one virt-queue, has following main parameters:
HW-queue-index: the HW-queue index used
nb_desc: the number of HW descriptors
opaque: driver's specific info
Note1: this API return virt-queue index which will used in later API.
If user want create multiple virt-queue one the same HW-queue,
they could achieved by call queue_setup with the same
HW-queue-index.
Note2: I think it's hard to define queue_setup config paramter, and
also this is control API, so I think it's OK to use opaque
pointer to implement it.I'm not sure opaque pointer will work in practice, so I think we should try and standardize the parameters as much as possible. Since it's a control plane API, using a struct with a superset of parameters may be workable. Let's start with a minimum set and build up from there.
[dma_copy/memset/sg]: all has vq_id input parameter.
Note: I notice dpaa can't support single and sg in one virt-queue, and
I think it's maybe software implement policy other than HW
restriction because virt-queue could share the same HW-queue.Presumably for queues which support sq, the single-enqueue APIs can use a single sg list internally?
Here we use vq_id to tackle different scenario, like local-to-local/
local-to-host and etc.
5) And the dmadev public data-plane API (just prototype):
dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
-- flags: used as an extended parameter, it could be uint32_tSuggest uint64_t rather than uint32_t to ensure we have expansion room? Otherwise +1
dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)
+1
dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
-- sg: struct dma_scatterlist arrayI don't think our drivers will be directly implementing this API, but so long as SG support is listed as a capability flag I'm fine with this as an API. [We can't fudge it as a bunch of single copies, because that would cause us to have multiple cookies rather than one]
uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
uint16_t nb_cpls, bool *has_error)
-- nb_cpls: indicate max process operations number
-- has_error: indicate if there is an error
-- return value: the number of successful completed operations.
-- example:
1) If there are already 32 completed ops, and 4th is error, and
nb_cpls is 32, then the ret will be 3(because 1/2/3th is OK), and
has_error will be true.
2) If there are already 32 completed ops, and all successful
completed, then the ret will be min(32, nb_cpls), and has_error
will be false.
3) If there are already 32 completed ops, and all failed completed,
then the ret will be 0, and has_error will be true.+1 for this
uint16_t rte_dmadev_completed_status(dev_id, vq_id, dma_cookie_t *cookie,
uint16_t nb_status, uint32_t *status)
-- return value: the number of failed completed operations.
And here I agree with Morten: we should design API which adapts to DPDK
service scenarios. So we don't support some sound-cards DMA, and 2D memory
copy which mainly used in video scenarios.Can I suggest a few adjustments here to the semantics of this API. In future we may have operations which return a status value, e.g. our hardware can support ops like compare equal/not-equal, which means that this API would be meaningful even in case of success. Therefore, I suggest that the return value be changed to allow success also to be returned in the array, and the return value is not the number of failed ops, but the number of ops for which status is being returned. Also for consideration: when trying to implement this in a prototype in our driver, it would be easier if we relax the restriction on the "completed" API so that we can flag has_error when an error is detected rather than guaranteeing to return all elements right up to the error. For example, if we have a burst of packets and one is problematic, it may be easier to flag the error at the start of the burst and then have a few successful entries at the start of the completed_status array. [Separate from this] We should also have a "has_error" or "more_errors" flag on this API too, to indicate when the user can switch back to using the regular "completed" API. This means that apps switch from one API to the other when "has_error" is true, and only switch back when it becomes false again.
6) The dma_cookie_t is signed int type, when <0 it mean error, it's
monotonically increasing base on HW-queue (other than virt-queue). The
driver needs to make sure this because the damdev framework don't manage
the dma_cookie's creation.+1 to this. I think we also should specify that the cookie is guaranteed to wrap at a power of 2 value (UINT16_MAX??). This allows it to be used as an index into a circular buffer just by masking.
7) Because data-plane APIs are not thread-safe, and user could determine
virt-queue to HW-queue's map (at the queue-setup stage), so it is user's
duty to ensure thread-safe.
8) One example:
vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
if (vq_id < 0) {
// create virt-queue failed
return;
}
// submit memcpy task
cookit = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
if (cookie < 0) {
// submit failed
return;
}
// get complete task
ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, has_error);
if (!has_error && ret == 1) {
// the memcpy successful complete
}+1
9) As octeontx2_dma support sg-list which has many valid buffers in
dpi_dma_buf_ptr_s, it could call the rte_dmadev_memcpy_sg API.
10) As ioat, it could delcare support one HW-queue at dev_configure stage, and
only support create one virt-queue.+1
11) As dpaa2_qdma, I think it could migrate to new framework, but still wait
for dpaa2_qdma guys feedback.
12) About the prototype src/dst parameters of rte_dmadev_memcpy API, we have
two candidates which are iova and void *, how about introduce dma_addr_t
type which could be va or iova ?Many thanks again.