Thread (339 messages) 339 messages, 17 authors, 2021-10-17

Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library

From: Jerin Jacob <hidden>
Date: 2021-07-08 18:36:11

On Thu, Jul 8, 2021 at 8:41 AM fengchengwen [off-list ref] wrote:
quoted
quoted
quoted
It's just more conditionals and branches all through the code. Inside the
user application, the user has to check whether to set the flag or not (or
special-case the last transaction outside the loop), and within the driver,
there has to be a branch whether or not to call the doorbell function. The
code on both sides is far simpler and more readable if the doorbell
function is exactly that - a separate function.
I disagree. The reason is:

We will have two classes of applications

a) do dma copy request as and when it has data(I think, this is the
prime use case), for those,
I think, it is considerable overhead to have two function invocation
per transfer i.e
rte_dma_copy() and rte_dma_perform()

b) do dma copy when the data is reached to a logical state,  like copy
IP frame from Ethernet packets or so,
In that case, the application will have  a LOGIC to detect when to
perform it so on the end of
that rte_dma_copy() flag can be updated to fire the doorbell.

IMO, We are comparing against a branch(flag is already in register) vs
a set of instructions for
1) function pointer overhead
2) Need to use the channel context again back in another function.

IMO, a single branch is most optimal from performance PoV.
Ok, let's try it and see how it goes.
Test result show:
1) For Kunpeng platform (ARMv8) could benefit very little with doorbell in flags
2) For Xeon E5-2690 v2 (X86) could benefit with separate function
3) Both platform could benefit with doorbell in flags if burst < 5

There is a performance gain in small bursts (<5). Given the extensive use of bursts
 in DPDK applications and users are accustomed to the concept, I do
not recommend
using the 'doorbell' in flags.
There is NO concept change between one option vs other option. Just
argument differnet.
Also, _perform() scheme not used anywhere in DPDK. I

Regarding performance, I have added dummy instructions to simulate the real work
load[1], now burst also has some gain in both x86 and arm64[3]

I have modified your application[2] to dpdk test application to use
cpu isolation etc.
So this is gain in flag scheme ad code is checked in to Github[2[

[1]
static inline void
delay(void)
{
        volatile int k;

        for (k = 0; k < 16; k++) {


      }

}

__rte_noinline
int
hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst,
int len, const int flags)
{
         delay();

        *ring = 1;

         return 0;
}

__rte_noinline
int
hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src,
void *dst, int len, const int flags)
{
        delay();

        *ring = 1;

        if (unlikely(flags == 1)) {
                rte_wmb();
                *doorbell = 1;
        }
      return 0;
}


[2]

https://github.com/jerinjacobk/dpdk-dmatest/commit/4fc9bc3029543bbc4caaa5183d98bac93c34f588

Update results [3]

Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xf00

core=24 Timer running at 2600.00MHz
   test_for_perform_after_multiple_enqueue: burst=1 cycles=46.000000
      test_for_last_enqueue_issue_doorbell: burst=1 cycles=45.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=2 cycles=90.000000
      test_for_last_enqueue_issue_doorbell: burst=2 cycles=89.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=3 cycles=134.000000
      test_for_last_enqueue_issue_doorbell: burst=3 cycles=133.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=4 cycles=177.000000
      test_for_last_enqueue_issue_doorbell: burst=4 cycles=176.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=5 cycles=221.000000
      test_for_last_enqueue_issue_doorbell: burst=5 cycles=221.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=6 cycles=265.000000
      test_for_last_enqueue_issue_doorbell: burst=6 cycles=265.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=7 cycles=333.000000
      test_for_last_enqueue_issue_doorbell: burst=7 cycles=309.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=8 cycles=375.000000
      test_for_last_enqueue_issue_doorbell: burst=8 cycles=373.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=9 cycles=418.000000
      test_for_last_enqueue_issue_doorbell: burst=9 cycles=414.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=10 cycles=462.000000
      test_for_last_enqueue_issue_doorbell: burst=10 cycles=458.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=11 cycles=507.000000
      test_for_last_enqueue_issue_doorbell: burst=11 cycles=501.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=12 cycles=552.000000
      test_for_last_enqueue_issue_doorbell: burst=12 cycles=546.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=13 cycles=593.000000
      test_for_last_enqueue_issue_doorbell: burst=13 cycles=590.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=14 cycles=638.000000
      test_for_last_enqueue_issue_doorbell: burst=14 cycles=634.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=15 cycles=681.000000
      test_for_last_enqueue_issue_doorbell: burst=15 cycles=678.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=16 cycles=725.000000
      test_for_last_enqueue_issue_doorbell: burst=16 cycles=722.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=17 cycles=770.000000
      test_for_last_enqueue_issue_doorbell: burst=17 cycles=767.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=18 cycles=815.000000
      test_for_last_enqueue_issue_doorbell: burst=18 cycles=812.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=19 cycles=857.000000
      test_for_last_enqueue_issue_doorbell: burst=19 cycles=854.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=20 cycles=902.000000
      test_for_last_enqueue_issue_doorbell: burst=20 cycles=899.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=21 cycles=945.000000
      test_for_last_enqueue_issue_doorbell: burst=21 cycles=943.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=22 cycles=990.000000
      test_for_last_enqueue_issue_doorbell: burst=22 cycles=988.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=23 cycles=1033.000000
      test_for_last_enqueue_issue_doorbell: burst=23 cycles=1031.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=24 cycles=1077.000000
      test_for_last_enqueue_issue_doorbell: burst=24 cycles=1075.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=25 cycles=1121.000000
      test_for_last_enqueue_issue_doorbell: burst=25 cycles=1119.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=26 cycles=1166.000000
      test_for_last_enqueue_issue_doorbell: burst=26 cycles=1163.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=27 cycles=1208.000000
      test_for_last_enqueue_issue_doorbell: burst=27 cycles=1208.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=28 cycles=1252.000000
      test_for_last_enqueue_issue_doorbell: burst=28 cycles=1252.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=29 cycles=1295.000000
      test_for_last_enqueue_issue_doorbell: burst=29 cycles=1295.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=30 cycles=1342.000000
      test_for_last_enqueue_issue_doorbell: burst=30 cycles=1340.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=31 cycles=1386.000000
      test_for_last_enqueue_issue_doorbell: burst=31 cycles=1384.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=32 cycles=1429.000000
      test_for_last_enqueue_issue_doorbell: burst=32 cycles=1428.000000
-------------------------------------------------------------------------------



octeontx2:

See https://doc.dpdk.org/guides/prog_guide/profile_app.html section
62.2.3. High-resolution cycle counter


meson --cross config/arm/arm64_octeontx2_linux_gcc
-Dc_args='-DRTE_ARM_EAL_RDTSC_USE_PMU' build

 echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xff0000
RTE>>dma_perf_autotest^M
lcore=16 Timer running at 2400.00MHz
   test_for_perform_after_multiple_enqueue: burst=1 cycles=105.000000
      test_for_last_enqueue_issue_doorbell: burst=1 cycles=105.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=2 cycles=207.000000
      test_for_last_enqueue_issue_doorbell: burst=2 cycles=207.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=3 cycles=309.000000
      test_for_last_enqueue_issue_doorbell: burst=3 cycles=310.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=4 cycles=411.000000
      test_for_last_enqueue_issue_doorbell: burst=4 cycles=410.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=5 cycles=513.000000
      test_for_last_enqueue_issue_doorbell: burst=5 cycles=512.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=6 cycles=615.000000
      test_for_last_enqueue_issue_doorbell: burst=6 cycles=615.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=7 cycles=717.000000
      test_for_last_enqueue_issue_doorbell: burst=7 cycles=716.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=8 cycles=819.000000
      test_for_last_enqueue_issue_doorbell: burst=8 cycles=818.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=9 cycles=921.000000
      test_for_last_enqueue_issue_doorbell: burst=9 cycles=922.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=10 cycles=1023.000000
      test_for_last_enqueue_issue_doorbell: burst=10 cycles=1022.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=11 cycles=1126.000000
      test_for_last_enqueue_issue_doorbell: burst=11 cycles=1124.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=12 cycles=1227.000000
      test_for_last_enqueue_issue_doorbell: burst=12 cycles=1227.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=13 cycles=1329.000000
      test_for_last_enqueue_issue_doorbell: burst=13 cycles=1328.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=14 cycles=1431.000000
      test_for_last_enqueue_issue_doorbell: burst=14 cycles=1430.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=15 cycles=1534.000000
      test_for_last_enqueue_issue_doorbell: burst=15 cycles=1534.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=16 cycles=1638.000000
      test_for_last_enqueue_issue_doorbell: burst=16 cycles=1640.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=17 cycles=1746.000000
      test_for_last_enqueue_issue_doorbell: burst=17 cycles=1739.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=18 cycles=1847.000000
      test_for_last_enqueue_issue_doorbell: burst=18 cycles=1841.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=19 cycles=1950.000000
      test_for_last_enqueue_issue_doorbell: burst=19 cycles=1944.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=20 cycles=2051.000000
       test_for_last_enqueue_issue_doorbell: burst=20 cycles=2045.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=21 cycles=2154.000000
      test_for_last_enqueue_issue_doorbell: burst=21 cycles=2148.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=22 cycles=2257.000000
      test_for_last_enqueue_issue_doorbell: burst=22 cycles=2249.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=23 cycles=2358.000000
      test_for_last_enqueue_issue_doorbell: burst=23 cycles=2352.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=24 cycles=2459.000000
      test_for_last_enqueue_issue_doorbell: burst=24 cycles=2454.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=25 cycles=2562.000000
      test_for_last_enqueue_issue_doorbell: burst=25 cycles=2555.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=26 cycles=2665.000000
      test_for_last_enqueue_issue_doorbell: burst=26 cycles=2657.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=27 cycles=2766.000000
      test_for_last_enqueue_issue_doorbell: burst=27 cycles=2760.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=28 cycles=2867.000000
      test_for_last_enqueue_issue_doorbell: burst=28 cycles=2861.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=29 cycles=2970.000000
      test_for_last_enqueue_issue_doorbell: burst=29 cycles=2964.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=30 cycles=3073.000000
      test_for_last_enqueue_issue_doorbell: burst=30 cycles=3065.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=31 cycles=3174.000000
      test_for_last_enqueue_issue_doorbell: burst=31 cycles=3168.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=32 cycles=3275.000000
      test_for_last_enqueue_issue_doorbell: burst=32 cycles=3269.000000
-------------------------------------------------------------------------------
Test OK
RTE>



And also user may confuse about the doorbell operations.

Kunpeng platform test result:
    [root@SZ tmp]# ./a1 1
    burst = 1
    perform_after_multiple_enqueue: burst:1 cost:0s.554422us
    doorbell_for_every_enqueue: burst:1 cost:0s.450927us
    last_enqueue_issue_doorbell: burst:1 cost:0s.450479us
    [root@SZ tmp]#
    [root@SZ tmp]# ./a1 2
    burst = 2
    perform_after_multiple_enqueue: burst:2 cost:0s.900884us
    doorbell_for_every_enqueue: burst:2 cost:0s.866732us
    last_enqueue_issue_doorbell: burst:2 cost:0s.732469us
    [root@SZ tmp]# ./a1 5
    burst = 5
    perform_after_multiple_enqueue: burst:5 cost:1s.732410us
    doorbell_for_every_enqueue: burst:5 cost:2s.115479us
    last_enqueue_issue_doorbell: burst:5 cost:1s.759349us
    [root@SZ tmp]# ./a1 10
    burst = 10
    perform_after_multiple_enqueue: burst:10 cost:3s.490716us
    doorbell_for_every_enqueue: burst:10 cost:4s.194691us
    last_enqueue_issue_doorbell: burst:10 cost:3s.331825us
    [root@SZ tmp]# ./a1 30
    burst = 30
    perform_after_multiple_enqueue: burst:30 cost:9s.61761us
    doorbell_for_every_enqueue: burst:30 cost:12s.517082us
    last_enqueue_issue_doorbell: burst:30 cost:9s.614802us

X86 platform test result:
    fengchengwen@SZ:~/tmp$ ./a1 1
    burst = 1
    perform_after_multiple_enqueue: burst:1 cost:0s.406331us
    doorbell_for_every_enqueue: burst:1 cost:0s.331109us
    last_enqueue_issue_doorbell: burst:1 cost:0s.381782us
    fengchengwen@SZ:~/tmp$ ./a1 2
    burst = 2
    perform_after_multiple_enqueue: burst:2 cost:0s.569024us
    doorbell_for_every_enqueue: burst:2 cost:0s.643449us
    last_enqueue_issue_doorbell: burst:2 cost:0s.486639us
    fengchengwen@SZ:~/tmp$ ./a1 5
    burst = 5
    perform_after_multiple_enqueue: burst:5 cost:1s.166384us
    doorbell_for_every_enqueue: burst:5 cost:1s.602369us
    last_enqueue_issue_doorbell: burst:5 cost:1s.209392us
    fengchengwen@SZ:~/tmp$ ./a1 10
    burst = 10
    perform_after_multiple_enqueue: burst:10 cost:2s.229901us
    doorbell_for_every_enqueue: burst:10 cost:3s.754802us
    last_enqueue_issue_doorbell: burst:10 cost:2s.328705us
    fengchengwen@SZ:~/tmp$
    fengchengwen@SZ:~/tmp$ ./a1 30
    burst = 30
    perform_after_multiple_enqueue: burst:30 cost:6s.132817us
    doorbell_for_every_enqueue: burst:30 cost:9s.944619us
    last_enqueue_issue_doorbell: burst:30 cost:7s.73551us


test-code:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>
#include <sys/time.h>

struct dmadev;

unsigned int dev_reg[10240];
volatile unsigned int *ring;
volatile unsigned int *doorbell;

void init_global(void)
{
        ring = &dev_reg[100];
        doorbell = &dev_reg[10000];
}

#define rte_wmb() asm volatile("dmb oshst" : : : "memory")
//#define rte_wmb() asm volatile ("" : : : "memory")

typedef int (*enqueue_t)(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags);
typedef void (*perform_t)(struct dmadev *dev, int vchan);

struct dmadev {
        enqueue_t enqueue;
        perform_t perform;
        char rsv[512];
};

int hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags)
{
        *ring = 1;
}

int hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags)
{
        *ring = 1;
        if (flags == 1) {
                rte_wmb();
                *doorbell = 1;
        }
}

void hisi_dma_perform(struct dmadev *dev, int vchan)
{
        rte_wmb();
        *doorbell = 1;
}

struct dmadev devlist[64];

void init_devlist(bool enq_doorbell)
{
        int i;
        for (i = 0; i < 64; i++) {
                devlist[i].enqueue = enq_doorbell ? hisi_dma_enqueue_doorbell : hisi_dma_enqueue;
                devlist[i].perform = hisi_dma_perform;
        }
}

static inline int dma_enqueue(int dev_id, int vchan, void *src, void *dst, int len, int flags)
{
        struct dmadev *dev = &devlist[dev_id];
        return dev->enqueue(dev, vchan, src, dst, len, flags);
}

static inline void dma_perform(int dev_id, int vchan)
{
        struct dmadev *dev = &devlist[dev_id];
        return dev->perform(dev, vchan);
}

#define MAX_LOOPS       90000000

void test_for_perform_after_multiple_enqueue(int burst)
{
        struct timeval start, end, delta;
        unsigned int i, j;
        init_devlist(false);
        gettimeofday(&start, NULL);
        for (i = 0; i < MAX_LOOPS; i++) {
                for (j = 0; j < burst; j++)
                        (void)dma_enqueue(10, 0, NULL, NULL, 0, 0);
                dma_perform(10, 0);
        }
        gettimeofday(&end, NULL);
        timersub(&end, &start, &delta);
        printf("perform_after_multiple_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
}

void test_for_doorbell_for_every_enqueue(int burst)
{
        struct timeval start, end, delta;
        unsigned int i, j;
        init_devlist(true);
        gettimeofday(&start, NULL);
        for (i = 0; i < MAX_LOOPS; i++) {
                for (j = 0; j < burst; j++)
                        (void)dma_enqueue(10, 0, NULL, NULL, 0, 1);
        }
        gettimeofday(&end, NULL);
        timersub(&end, &start, &delta);
        printf("doorbell_for_every_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
}

void test_for_last_enqueue_issue_doorbell(int burst)
{
        struct timeval start, end, delta;
        unsigned int i, j;
        init_devlist(true);
        gettimeofday(&start, NULL);
        for (i = 0; i < MAX_LOOPS; i++) {
                for (j = 0; j < burst - 1; j++)
                        (void)dma_enqueue(10, 0, NULL, NULL, 0, 0);
                dma_enqueue(10, 0, NULL, NULL, 0, 1);
        }
        gettimeofday(&end, NULL);
        timersub(&end, &start, &delta);
        printf("last_enqueue_issue_doorbell: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
}

void main(int argc, char *argv[])
{
        if (argc < 2) {
                printf("please input burst parameter!\n");
                return;
        }
        init_global();
        int burst = atol(argv[1]);
        printf("burst = %d \n", burst);
        test_for_perform_after_multiple_enqueue(burst);
        test_for_doorbell_for_every_enqueue(burst);
        test_for_last_enqueue_issue_doorbell(burst);
}
quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
<snip>
quoted
quoted
quoted
quoted
+ +/** + * @warning + * @b EXPERIMENTAL: this API may change
without prior notice.  + * + * Returns the number of operations
that failed to complete.  + * NOTE: This API was used when
rte_dmadev_completed has_error was set.  + * + * @param dev_id
+ *   The identifier of the device.  + * @param vq_id + *   The
identifier of virt queue.
(> + * @param nb_status
quoted
+ *   Indicates the size  of status array.  + * @param[out]
status + *   The error code of operations that failed to
complete.  + * @param[out] cookie + *   The last failed
completed operation's cookie.  + * + * @return + *   The number
of operations that failed to complete.  + * + * NOTE: The
caller must ensure that the input parameter is valid and the +
*       corresponding device supports the operation.  + */
+__rte_experimental +static inline uint16_t
+rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id, +
const uint16_t nb_status, uint32_t *status, +
dma_cookie_t *cookie)
IMO, it is better to move cookie/rind_idx at 3.  Why it would
return any array of errors? since it called after
rte_dmadev_completed() has has_error. Is it better to change

rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id,
dma_cookie_t *cookie,  uint32_t *status)

I also think, we may need to set status as bitmask and enumerate
all the combination of error codes of all the driver and return
string from driver existing rte_flow_error

See struct rte_flow_error { enum rte_flow_error_type type; /**<
Cause field and error types. */ const void *cause; /**< Object
responsible for the error. */ const char *message; /**<
Human-readable error message. */ };
I think we need a multi-return value API here, as we may add
operations in future which have non-error status values to return.
The obvious case is DMA engines which support "compare" operations.
In that case a successful compare (as in there were no DMA or HW
errors) can return "equal" or "not-equal" as statuses. For general
"copy" operations, the faster completion op can be used to just
return successful values (and only call this status version on
error), while apps using those compare ops or a mixture of copy and
compare ops, would always use the slower one that returns status
values for each and every op..

The ioat APIs used 32-bit integer values for this status array so
as to allow e.g. 16-bits for error code and 16-bits for future
status values. For most operations there should be a fairly small
set of things that can go wrong, i.e. bad source address, bad
destination address or invalid length.  Within that we may have a
couple of specifics for why an address is bad, but even so I don't
think we need to start having multiple bit combinations.
OK. What is the purpose of errors status? Is it for application
printing it or Does the application need to take any action based on
specific error requests?
It's largely for information purposes, but in the case of SVA/SVM
errors could occur due to the memory not being pinned, i.e. a page
fault, in some cases. If that happens, then it's up the app to either
touch the memory and retry the copy, or to do a SW memcpy as a
fallback.

In other error cases, I think it's good to tell the application if it's
passing around bad data, or data that is beyond the scope of hardware,
e.g.  a copy that is beyond what can be done in a single transaction
for a HW instance. Given that there are always things that can go
wrong, I think we need some error reporting mechanism.
quoted
If the former is scope, then we need to define the standard enum
value for the error right?  ie. uint32_t *status needs to change to
enum rte_dma_error or so.
Sure. Perhaps an error/status structure either is an option, where we
explicitly call out error info from status info.
Agree. Better to have a structure with filed like,

1)  enum rte_dma_error_type 2)  memory to store, informative message on
fine aspects of error.  LIke address caused issue etc.(Which will be
driver-specific information).
The only issue I have with that is that once we have driver specific
information it is of little use to the application, since it can't know
anything about it excepy maybe log it.  I'd much rather have a set of error
codes telling user that "source address is wrong", "dest address is wrong",
and a generic "an address is wrong" in case driver/HW cannot distinguish
source of error. Can we see how far we get with just error codes before we
start into passing string messages around and all the memory management
headaches that implies.
Works for me. It should be "enum rte_dma_error_type" then, which has a standard
error type. Which is missing in the spec now.
+1
.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help