Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
From: Jerin Jacob <hidden>
Date: 2021-07-08 18:36:11
On Thu, Jul 8, 2021 at 8:41 AM fengchengwen [off-list ref] wrote:
quoted
quoted
quoted
It's just more conditionals and branches all through the code. Inside the user application, the user has to check whether to set the flag or not (or special-case the last transaction outside the loop), and within the driver, there has to be a branch whether or not to call the doorbell function. The code on both sides is far simpler and more readable if the doorbell function is exactly that - a separate function.I disagree. The reason is: We will have two classes of applications a) do dma copy request as and when it has data(I think, this is the prime use case), for those, I think, it is considerable overhead to have two function invocation per transfer i.e rte_dma_copy() and rte_dma_perform() b) do dma copy when the data is reached to a logical state, like copy IP frame from Ethernet packets or so, In that case, the application will have a LOGIC to detect when to perform it so on the end of that rte_dma_copy() flag can be updated to fire the doorbell. IMO, We are comparing against a branch(flag is already in register) vs a set of instructions for 1) function pointer overhead 2) Need to use the channel context again back in another function. IMO, a single branch is most optimal from performance PoV.Ok, let's try it and see how it goes.Test result show: 1) For Kunpeng platform (ARMv8) could benefit very little with doorbell in flags 2) For Xeon E5-2690 v2 (X86) could benefit with separate function 3) Both platform could benefit with doorbell in flags if burst < 5 There is a performance gain in small bursts (<5). Given the extensive use of bursts
in DPDK applications and users are accustomed to the concept, I do not recommend
using the 'doorbell' in flags.
There is NO concept change between one option vs other option. Just
argument differnet.
Also, _perform() scheme not used anywhere in DPDK. I
Regarding performance, I have added dummy instructions to simulate the real work
load[1], now burst also has some gain in both x86 and arm64[3]
I have modified your application[2] to dpdk test application to use
cpu isolation etc.
So this is gain in flag scheme ad code is checked in to Github[2[
[1]
static inline void
delay(void)
{
volatile int k;
for (k = 0; k < 16; k++) {
}
}
__rte_noinline
int
hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst,
int len, const int flags)
{
delay();
*ring = 1;
return 0;
}
__rte_noinline
int
hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src,
void *dst, int len, const int flags)
{
delay();
*ring = 1;
if (unlikely(flags == 1)) {
rte_wmb();
*doorbell = 1;
}
return 0;
}
[2]
https://github.com/jerinjacobk/dpdk-dmatest/commit/4fc9bc3029543bbc4caaa5183d98bac93c34f588
Update results [3]
Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xf00
core=24 Timer running at 2600.00MHz
test_for_perform_after_multiple_enqueue: burst=1 cycles=46.000000
test_for_last_enqueue_issue_doorbell: burst=1 cycles=45.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=2 cycles=90.000000
test_for_last_enqueue_issue_doorbell: burst=2 cycles=89.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=3 cycles=134.000000
test_for_last_enqueue_issue_doorbell: burst=3 cycles=133.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=4 cycles=177.000000
test_for_last_enqueue_issue_doorbell: burst=4 cycles=176.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=5 cycles=221.000000
test_for_last_enqueue_issue_doorbell: burst=5 cycles=221.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=6 cycles=265.000000
test_for_last_enqueue_issue_doorbell: burst=6 cycles=265.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=7 cycles=333.000000
test_for_last_enqueue_issue_doorbell: burst=7 cycles=309.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=8 cycles=375.000000
test_for_last_enqueue_issue_doorbell: burst=8 cycles=373.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=9 cycles=418.000000
test_for_last_enqueue_issue_doorbell: burst=9 cycles=414.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=10 cycles=462.000000
test_for_last_enqueue_issue_doorbell: burst=10 cycles=458.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=11 cycles=507.000000
test_for_last_enqueue_issue_doorbell: burst=11 cycles=501.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=12 cycles=552.000000
test_for_last_enqueue_issue_doorbell: burst=12 cycles=546.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=13 cycles=593.000000
test_for_last_enqueue_issue_doorbell: burst=13 cycles=590.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=14 cycles=638.000000
test_for_last_enqueue_issue_doorbell: burst=14 cycles=634.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=15 cycles=681.000000
test_for_last_enqueue_issue_doorbell: burst=15 cycles=678.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=16 cycles=725.000000
test_for_last_enqueue_issue_doorbell: burst=16 cycles=722.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=17 cycles=770.000000
test_for_last_enqueue_issue_doorbell: burst=17 cycles=767.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=18 cycles=815.000000
test_for_last_enqueue_issue_doorbell: burst=18 cycles=812.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=19 cycles=857.000000
test_for_last_enqueue_issue_doorbell: burst=19 cycles=854.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=20 cycles=902.000000
test_for_last_enqueue_issue_doorbell: burst=20 cycles=899.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=21 cycles=945.000000
test_for_last_enqueue_issue_doorbell: burst=21 cycles=943.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=22 cycles=990.000000
test_for_last_enqueue_issue_doorbell: burst=22 cycles=988.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=23 cycles=1033.000000
test_for_last_enqueue_issue_doorbell: burst=23 cycles=1031.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=24 cycles=1077.000000
test_for_last_enqueue_issue_doorbell: burst=24 cycles=1075.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=25 cycles=1121.000000
test_for_last_enqueue_issue_doorbell: burst=25 cycles=1119.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=26 cycles=1166.000000
test_for_last_enqueue_issue_doorbell: burst=26 cycles=1163.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=27 cycles=1208.000000
test_for_last_enqueue_issue_doorbell: burst=27 cycles=1208.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=28 cycles=1252.000000
test_for_last_enqueue_issue_doorbell: burst=28 cycles=1252.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=29 cycles=1295.000000
test_for_last_enqueue_issue_doorbell: burst=29 cycles=1295.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=30 cycles=1342.000000
test_for_last_enqueue_issue_doorbell: burst=30 cycles=1340.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=31 cycles=1386.000000
test_for_last_enqueue_issue_doorbell: burst=31 cycles=1384.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=32 cycles=1429.000000
test_for_last_enqueue_issue_doorbell: burst=32 cycles=1428.000000
-------------------------------------------------------------------------------
octeontx2:
See https://doc.dpdk.org/guides/prog_guide/profile_app.html section
62.2.3. High-resolution cycle counter
meson --cross config/arm/arm64_octeontx2_linux_gcc
-Dc_args='-DRTE_ARM_EAL_RDTSC_USE_PMU' build
echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xff0000
RTE>>dma_perf_autotest^M
lcore=16 Timer running at 2400.00MHz
test_for_perform_after_multiple_enqueue: burst=1 cycles=105.000000
test_for_last_enqueue_issue_doorbell: burst=1 cycles=105.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=2 cycles=207.000000
test_for_last_enqueue_issue_doorbell: burst=2 cycles=207.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=3 cycles=309.000000
test_for_last_enqueue_issue_doorbell: burst=3 cycles=310.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=4 cycles=411.000000
test_for_last_enqueue_issue_doorbell: burst=4 cycles=410.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=5 cycles=513.000000
test_for_last_enqueue_issue_doorbell: burst=5 cycles=512.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=6 cycles=615.000000
test_for_last_enqueue_issue_doorbell: burst=6 cycles=615.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=7 cycles=717.000000
test_for_last_enqueue_issue_doorbell: burst=7 cycles=716.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=8 cycles=819.000000
test_for_last_enqueue_issue_doorbell: burst=8 cycles=818.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=9 cycles=921.000000
test_for_last_enqueue_issue_doorbell: burst=9 cycles=922.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=10 cycles=1023.000000
test_for_last_enqueue_issue_doorbell: burst=10 cycles=1022.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=11 cycles=1126.000000
test_for_last_enqueue_issue_doorbell: burst=11 cycles=1124.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=12 cycles=1227.000000
test_for_last_enqueue_issue_doorbell: burst=12 cycles=1227.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=13 cycles=1329.000000
test_for_last_enqueue_issue_doorbell: burst=13 cycles=1328.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=14 cycles=1431.000000
test_for_last_enqueue_issue_doorbell: burst=14 cycles=1430.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=15 cycles=1534.000000
test_for_last_enqueue_issue_doorbell: burst=15 cycles=1534.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=16 cycles=1638.000000
test_for_last_enqueue_issue_doorbell: burst=16 cycles=1640.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=17 cycles=1746.000000
test_for_last_enqueue_issue_doorbell: burst=17 cycles=1739.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=18 cycles=1847.000000
test_for_last_enqueue_issue_doorbell: burst=18 cycles=1841.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=19 cycles=1950.000000
test_for_last_enqueue_issue_doorbell: burst=19 cycles=1944.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=20 cycles=2051.000000
test_for_last_enqueue_issue_doorbell: burst=20 cycles=2045.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=21 cycles=2154.000000
test_for_last_enqueue_issue_doorbell: burst=21 cycles=2148.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=22 cycles=2257.000000
test_for_last_enqueue_issue_doorbell: burst=22 cycles=2249.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=23 cycles=2358.000000
test_for_last_enqueue_issue_doorbell: burst=23 cycles=2352.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=24 cycles=2459.000000
test_for_last_enqueue_issue_doorbell: burst=24 cycles=2454.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=25 cycles=2562.000000
test_for_last_enqueue_issue_doorbell: burst=25 cycles=2555.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=26 cycles=2665.000000
test_for_last_enqueue_issue_doorbell: burst=26 cycles=2657.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=27 cycles=2766.000000
test_for_last_enqueue_issue_doorbell: burst=27 cycles=2760.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=28 cycles=2867.000000
test_for_last_enqueue_issue_doorbell: burst=28 cycles=2861.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=29 cycles=2970.000000
test_for_last_enqueue_issue_doorbell: burst=29 cycles=2964.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=30 cycles=3073.000000
test_for_last_enqueue_issue_doorbell: burst=30 cycles=3065.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=31 cycles=3174.000000
test_for_last_enqueue_issue_doorbell: burst=31 cycles=3168.000000
-------------------------------------------------------------------------------
test_for_perform_after_multiple_enqueue: burst=32 cycles=3275.000000
test_for_last_enqueue_issue_doorbell: burst=32 cycles=3269.000000
-------------------------------------------------------------------------------
Test OK
RTE>
And also user may confuse about the doorbell operations. Kunpeng platform test result: [root@SZ tmp]# ./a1 1 burst = 1 perform_after_multiple_enqueue: burst:1 cost:0s.554422us doorbell_for_every_enqueue: burst:1 cost:0s.450927us last_enqueue_issue_doorbell: burst:1 cost:0s.450479us [root@SZ tmp]# [root@SZ tmp]# ./a1 2 burst = 2 perform_after_multiple_enqueue: burst:2 cost:0s.900884us doorbell_for_every_enqueue: burst:2 cost:0s.866732us last_enqueue_issue_doorbell: burst:2 cost:0s.732469us [root@SZ tmp]# ./a1 5 burst = 5 perform_after_multiple_enqueue: burst:5 cost:1s.732410us doorbell_for_every_enqueue: burst:5 cost:2s.115479us last_enqueue_issue_doorbell: burst:5 cost:1s.759349us [root@SZ tmp]# ./a1 10 burst = 10 perform_after_multiple_enqueue: burst:10 cost:3s.490716us doorbell_for_every_enqueue: burst:10 cost:4s.194691us last_enqueue_issue_doorbell: burst:10 cost:3s.331825us [root@SZ tmp]# ./a1 30 burst = 30 perform_after_multiple_enqueue: burst:30 cost:9s.61761us doorbell_for_every_enqueue: burst:30 cost:12s.517082us last_enqueue_issue_doorbell: burst:30 cost:9s.614802us X86 platform test result: fengchengwen@SZ:~/tmp$ ./a1 1 burst = 1 perform_after_multiple_enqueue: burst:1 cost:0s.406331us doorbell_for_every_enqueue: burst:1 cost:0s.331109us last_enqueue_issue_doorbell: burst:1 cost:0s.381782us fengchengwen@SZ:~/tmp$ ./a1 2 burst = 2 perform_after_multiple_enqueue: burst:2 cost:0s.569024us doorbell_for_every_enqueue: burst:2 cost:0s.643449us last_enqueue_issue_doorbell: burst:2 cost:0s.486639us fengchengwen@SZ:~/tmp$ ./a1 5 burst = 5 perform_after_multiple_enqueue: burst:5 cost:1s.166384us doorbell_for_every_enqueue: burst:5 cost:1s.602369us last_enqueue_issue_doorbell: burst:5 cost:1s.209392us fengchengwen@SZ:~/tmp$ ./a1 10 burst = 10 perform_after_multiple_enqueue: burst:10 cost:2s.229901us doorbell_for_every_enqueue: burst:10 cost:3s.754802us last_enqueue_issue_doorbell: burst:10 cost:2s.328705us fengchengwen@SZ:~/tmp$ fengchengwen@SZ:~/tmp$ ./a1 30 burst = 30 perform_after_multiple_enqueue: burst:30 cost:6s.132817us doorbell_for_every_enqueue: burst:30 cost:9s.944619us last_enqueue_issue_doorbell: burst:30 cost:7s.73551us test-code: #include <stdio.h> #include <stdlib.h> #include <stdbool.h> #include <time.h> #include <sys/time.h> struct dmadev; unsigned int dev_reg[10240]; volatile unsigned int *ring; volatile unsigned int *doorbell; void init_global(void) { ring = &dev_reg[100]; doorbell = &dev_reg[10000]; } #define rte_wmb() asm volatile("dmb oshst" : : : "memory") //#define rte_wmb() asm volatile ("" : : : "memory") typedef int (*enqueue_t)(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags); typedef void (*perform_t)(struct dmadev *dev, int vchan); struct dmadev { enqueue_t enqueue; perform_t perform; char rsv[512]; }; int hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags) { *ring = 1; } int hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags) { *ring = 1; if (flags == 1) { rte_wmb(); *doorbell = 1; } } void hisi_dma_perform(struct dmadev *dev, int vchan) { rte_wmb(); *doorbell = 1; } struct dmadev devlist[64]; void init_devlist(bool enq_doorbell) { int i; for (i = 0; i < 64; i++) { devlist[i].enqueue = enq_doorbell ? hisi_dma_enqueue_doorbell : hisi_dma_enqueue; devlist[i].perform = hisi_dma_perform; } } static inline int dma_enqueue(int dev_id, int vchan, void *src, void *dst, int len, int flags) { struct dmadev *dev = &devlist[dev_id]; return dev->enqueue(dev, vchan, src, dst, len, flags); } static inline void dma_perform(int dev_id, int vchan) { struct dmadev *dev = &devlist[dev_id]; return dev->perform(dev, vchan); } #define MAX_LOOPS 90000000 void test_for_perform_after_multiple_enqueue(int burst) { struct timeval start, end, delta; unsigned int i, j; init_devlist(false); gettimeofday(&start, NULL); for (i = 0; i < MAX_LOOPS; i++) { for (j = 0; j < burst; j++) (void)dma_enqueue(10, 0, NULL, NULL, 0, 0); dma_perform(10, 0); } gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("perform_after_multiple_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec); } void test_for_doorbell_for_every_enqueue(int burst) { struct timeval start, end, delta; unsigned int i, j; init_devlist(true); gettimeofday(&start, NULL); for (i = 0; i < MAX_LOOPS; i++) { for (j = 0; j < burst; j++) (void)dma_enqueue(10, 0, NULL, NULL, 0, 1); } gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("doorbell_for_every_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec); } void test_for_last_enqueue_issue_doorbell(int burst) { struct timeval start, end, delta; unsigned int i, j; init_devlist(true); gettimeofday(&start, NULL); for (i = 0; i < MAX_LOOPS; i++) { for (j = 0; j < burst - 1; j++) (void)dma_enqueue(10, 0, NULL, NULL, 0, 0); dma_enqueue(10, 0, NULL, NULL, 0, 1); } gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("last_enqueue_issue_doorbell: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec); } void main(int argc, char *argv[]) { if (argc < 2) { printf("please input burst parameter!\n"); return; } init_global(); int burst = atol(argv[1]); printf("burst = %d \n", burst); test_for_perform_after_multiple_enqueue(burst); test_for_doorbell_for_every_enqueue(burst); test_for_last_enqueue_issue_doorbell(burst); }quoted
quoted
quoted
quoted
quoted
quoted
quoted
quoted
<snip>quoted
quoted
quoted
quoted
+ +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Returns the number of operations that failed to complete. + * NOTE: This API was used when rte_dmadev_completed has_error was set. + * + * @param dev_id + * The identifier of the device. + * @param vq_id + * The identifier of virt queue.(> + * @param nb_statusquoted
+ * Indicates the size of status array. + * @param[out] status + * The error code of operations that failed to complete. + * @param[out] cookie + * The last failed completed operation's cookie. + * + * @return + * The number of operations that failed to complete. + * + * NOTE: The caller must ensure that the input parameter is valid and the + * corresponding device supports the operation. + */ +__rte_experimental +static inline uint16_t +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id, + const uint16_t nb_status, uint32_t *status, + dma_cookie_t *cookie)IMO, it is better to move cookie/rind_idx at 3. Why it would return any array of errors? since it called after rte_dmadev_completed() has has_error. Is it better to change rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id, dma_cookie_t *cookie, uint32_t *status) I also think, we may need to set status as bitmask and enumerate all the combination of error codes of all the driver and return string from driver existing rte_flow_error See struct rte_flow_error { enum rte_flow_error_type type; /**< Cause field and error types. */ const void *cause; /**< Object responsible for the error. */ const char *message; /**< Human-readable error message. */ };I think we need a multi-return value API here, as we may add operations in future which have non-error status values to return. The obvious case is DMA engines which support "compare" operations. In that case a successful compare (as in there were no DMA or HW errors) can return "equal" or "not-equal" as statuses. For general "copy" operations, the faster completion op can be used to just return successful values (and only call this status version on error), while apps using those compare ops or a mixture of copy and compare ops, would always use the slower one that returns status values for each and every op.. The ioat APIs used 32-bit integer values for this status array so as to allow e.g. 16-bits for error code and 16-bits for future status values. For most operations there should be a fairly small set of things that can go wrong, i.e. bad source address, bad destination address or invalid length. Within that we may have a couple of specifics for why an address is bad, but even so I don't think we need to start having multiple bit combinations.OK. What is the purpose of errors status? Is it for application printing it or Does the application need to take any action based on specific error requests?It's largely for information purposes, but in the case of SVA/SVM errors could occur due to the memory not being pinned, i.e. a page fault, in some cases. If that happens, then it's up the app to either touch the memory and retry the copy, or to do a SW memcpy as a fallback. In other error cases, I think it's good to tell the application if it's passing around bad data, or data that is beyond the scope of hardware, e.g. a copy that is beyond what can be done in a single transaction for a HW instance. Given that there are always things that can go wrong, I think we need some error reporting mechanism.quoted
If the former is scope, then we need to define the standard enum value for the error right? ie. uint32_t *status needs to change to enum rte_dma_error or so.Sure. Perhaps an error/status structure either is an option, where we explicitly call out error info from status info.Agree. Better to have a structure with filed like, 1) enum rte_dma_error_type 2) memory to store, informative message on fine aspects of error. LIke address caused issue etc.(Which will be driver-specific information).The only issue I have with that is that once we have driver specific information it is of little use to the application, since it can't know anything about it excepy maybe log it. I'd much rather have a set of error codes telling user that "source address is wrong", "dest address is wrong", and a generic "an address is wrong" in case driver/HW cannot distinguish source of error. Can we see how far we get with just error codes before we start into passing string messages around and all the memory management headaches that implies.Works for me. It should be "enum rte_dma_error_type" then, which has a standard error type. Which is missing in the spec now.+1 .