Thread (10 messages), 2 authors, 2021-01-14

[dpdk-dev] Re: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

From: Feifei Wang <hidden>
Date: 2021-01-11 01:57:25

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula [off-list ref]
Sent: January 8, 2021 18:58
To: Feifei Wang [off-list ref]; jerinj@marvell.com; Harry van Haaren [off-list ref]
Cc: dev@dpdk.org; nd [off-list ref]; Honnappa Nagarahalli [off-list ref]; stable@dpdk.org; Ruifeng Wang [off-list ref]; nd [off-list ref]; nd [off-list ref]; nd [off-list ref]
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,
Hi, Pavan

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula [off-list ref]
Sent: January 8, 2021 17:13
To: Feifei Wang [off-list ref]; jerinj@marvell.com; Harry van Haaren [off-list ref]
Cc: dev@dpdk.org; nd [off-list ref]; Honnappa Nagarahalli [off-list ref]; stable@dpdk.org; Ruifeng Wang [off-list ref]; nd [off-list ref]; nd [off-list ref]
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,

Hi, Pavan


-----Original Message-----
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
Sent: January 5, 2021 17:29
To: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com; Harry van Haaren <harry.van.haaren@intel.com>
Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,

Hi, Pavan

Sorry for my late reply and thanks very much for your review.

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
Sent: December 22, 2020 18:33
To: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com; Harry van Haaren <harry.van.haaren@intel.com>; Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Phil Yang <Phil.Yang@arm.com>
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Add release barriers before updating the processed packets for worker
lcores to ensure the worker lcore has really finished data
processing and then it can update the processed packets number.

I believe we can live with minor inaccuracies in the stats being presented,
as atomics are pretty heavy when the scheduler is limited to a burst size of 1.

One option is to move it before a pipeline operation (pipeline_event_tx,
pipeline_fwd_event, etc.), as they imply an implicit release barrier (all
the changes done to the event should be visible to the next core).

If I understand correctly, you mean moving the release barriers
before pipeline_event_tx or pipeline_fwd_event. This can ensure the
event has been processed before the next core begins to tx/fwd. For
example:

What I meant was that event APIs such as `rte_event_enqueue_burst` and
`rte_event_eth_tx_adapter_enqueue` act as an implicit release barrier, and
the API `rte_event_dequeue_burst` acts as an implicit acquire barrier.

Since the pipeline_* tests start with a dequeue() and end with an
enqueue(), I don't believe we need barriers in between.
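
The acquire/release pairing described above can be sketched with a small single-producer, single-consumer ring. This is only an illustration under stated assumptions: `struct spsc_ring`, `ring_enqueue`, `ring_dequeue` and `RING_SIZE` are hypothetical names, the code uses the GCC/Clang `__atomic` builtins, and it is not how any particular eventdev driver or rte_ring is implemented.

```c
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical capacity */

struct spsc_ring {
	uint32_t head; /* next slot to read, written by the consumer only */
	uint32_t tail; /* next slot to write, written by the producer only */
	int slots[RING_SIZE];
};

/* Producer side. Returns 1 on success, 0 if the ring is full. */
static int ring_enqueue(struct spsc_ring *r, int v)
{
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);

	if (tail - head == RING_SIZE)
		return 0;
	r->slots[tail % RING_SIZE] = v;
	/* Release: the slot write above is visible before the new tail. */
	__atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
	return 1;
}

/* Consumer side. Returns 1 on success, 0 if the ring is empty. */
static int ring_dequeue(struct spsc_ring *r, int *v)
{
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire: pairs with the release store in ring_enqueue(). */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);

	if (head == tail)
		return 0;
	*v = r->slots[head % RING_SIZE];
	/* Release: the slot read finishes before the slot is handed back. */
	__atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
	return 1;
}
```

In this sketch, whatever the producer wrote before `ring_enqueue()` is visible to the consumer after a successful `ring_dequeue()`. Nothing here orders operations that come after the enqueue in program order, which is the point debated later in the thread.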


Sorry for my misunderstanding. I agree with you that no barriers are
needed between dequeue and enqueue.

Now, let's go back to the beginning. With this patch, our barrier is mainly
for the synchronization variable "w->processed_pkts". As we all know, the
event is first dequeued and then enqueued; after this, the event can be
treated as a processed event and included in the statistics
("w->processed_pkts++").

Thus, we add a release barrier before "w->processed_pkts++" to prevent this
operation from being executed ahead of time. For example:

dequeue  ->  w->processed_pkts++  ->  enqueue

In this case the worker has not actually finished processing the event, but
the event is treated as processed and included in the statistics.
But the current sequence is dequeue -> enqueue -> w->processed_pkts++,
and enqueue already acts as an explicit release barrier, right?
Sorry, maybe I cannot understand how "enqueue" acts as an explicit release
barrier. I think of two possibilities:
1. As you said before, all the changes done to the event should be
visible to the next core, and enqueue is an operation on the event, so the
next core should wait for the event to be enqueued.
I think this is due to data dependence on the same variable. However,
'w->processed_pkts' and 'ev' are different variables, so this cannot
prevent 'w->processed_pkts++' from being executed before enqueue.
The main core may then load the updated 'w->processed_pkts' while the
event is actually still being processed. For example:

Time Slot    Worker 1                 Main core
  1          dequeue
  2          w->processed_pkts++
  3                                   load w->processed_pkts
  4          enqueue

2. Some release barriers are included in enqueue. There is a
release barrier in rte_ring_enqueue:
move head -> copy elements to the ring -> release barrier -> update tail
-> w->processed_pkts++
However, this barrier cannot prevent 'w->processed_pkts++' from being
executed before update tail, and once update tail has finished, the enqueue
process can be seen as completed.
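
A minimal sketch of the ordering the patch asks for, assuming POSIX threads and the GCC/Clang `__atomic` builtins. `struct worker` here is a hypothetical stand-in, not the test application's type, and the store to `handled` merely represents "dequeue, process, enqueue". The worker increments the counter with release semantics and the main thread reads it with acquire semantics, so the counter should never be seen ahead of the finished work.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NB_EVENTS 100000u /* hypothetical number of events */

/* Hypothetical stand-in for the per-worker data in the test application. */
struct worker {
	uint64_t handled;        /* last event the worker finished */
	uint64_t processed_pkts; /* statistic polled by the main thread */
};

static void *worker_loop(void *arg)
{
	struct worker *w = arg;

	for (uint64_t i = 1; i <= NB_EVENTS; i++) {
		/* Stands in for dequeue + processing + enqueue of event i. */
		__atomic_store_n(&w->handled, i, __ATOMIC_RELAXED);
		/* Release: the store above cannot move after this increment. */
		__atomic_fetch_add(&w->processed_pkts, 1, __ATOMIC_RELEASE);
	}
	return NULL;
}

/* Returns how often the counter was observed ahead of the finished work. */
static uint64_t count_early_updates(void)
{
	struct worker w = { 0, 0 };
	uint64_t seen = 0, early = 0;
	pthread_t tid;

	if (pthread_create(&tid, NULL, worker_loop, &w) != 0)
		return UINT64_MAX;
	while (seen < NB_EVENTS) {
		/* Acquire: pairs with the release increment in the worker. */
		seen = __atomic_load_n(&w.processed_pkts, __ATOMIC_ACQUIRE);
		if (__atomic_load_n(&w.handled, __ATOMIC_RELAXED) < seen)
			early++;
	}
	pthread_join(tid, NULL);
	return early;
}
```

With the release/acquire pair, `count_early_updates()` is expected to return 0. If both sides used `__ATOMIC_RELAXED` instead, the language would no longer guarantee that result on a weakly ordered CPU, although a given run might still happen to show 0.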
I was talking about case 2 in particular; almost all enqueue calls have some
kind of release barrier in place. I do agree w->processed_pkts++ might get
reordered with the tail update, but since enqueue itself is a ldr + blr I was
hoping that it wouldn't occur.

We can continue the discussion once I have some performance data.
Ok, that's great. I think this is a meaningful discussion. Thanks for your effort.

Best Regards
Feifei
Thanks for your patience :)
Pavan.



By the way, I have two other questions about the pipeline process test
in "test_pipeline_queue".

1. When do we start counting processed events (w->processed_pkts)?

For the fwd mode (internal_port = false), when we choose single stage, the
application increments the number of events processed after
"pipeline_event_enqueue". However, when we choose multiple stages, the
application increments the number of events processed before
"pipeline_event_enqueue".
We count an event as processed when all the stages have completed and it is
transmitted.
So, maybe we can unify this. For example, for multiple stages:



 		if (cq_id == last_queue) {
 			ev.queue_id = tx_queue[ev.mbuf->port];
 			rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
+			pipeline_event_enqueue(dev, port, &ev);
 			w->processed_pkts++;
 		} else {
 			ev.queue_id++;
 			pipeline_fwd_event(&ev, sched_type_list[cq_id]);
+			pipeline_event_enqueue(dev, port, &ev);
 		}

-		pipeline_event_enqueue(dev, port, &ev);
The above change makes sense.
Thanks for your review, and I’ll update this change into the next
version.

2. Is "pipeline_event_enqueue" needed after "pipeline_event_tx" in tx mode?

For single_stage_burst_tx mode, after "pipeline_event_tx", the worker
has to enqueue again due to "pipeline_event_enqueue_burst", so maybe we
should jump out of the loop after "pipeline_event_tx",
We call enqueue burst to release the events, i.e. enqueue events with
RTE_EVENT_OP_RELEASE.
However, in the single-event case, for 'pipeline_queue_worker_single_stage_tx'
and 'pipeline_queue_worker_multi_stage_tx', there is no release operation
after tx.
for example:

 			if (ev[i].sched_type == RTE_SCHED_TYPE_ATOMIC) {
 				pipeline_event_tx(dev, port, &ev[i]);
 				ev[i].op = RTE_EVENT_OP_RELEASE;
 				w->processed_pkts++;
+				continue;
 			} else {
 				ev[i].queue_id++;
 				pipeline_fwd_event(&ev[i],
 						RTE_SCHED_TYPE_ATOMIC);
 			}
 		}

 		pipeline_event_enqueue_burst(dev, port, ev, nb_rx);

 		if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) {
+			__atomic_thread_fence(__ATOMIC_RELEASE);
 			pipeline_event_tx(dev, port, &ev);
 			w->processed_pkts++;
 		} else {
 			ev.queue_id++;
+			__atomic_thread_fence(__ATOMIC_RELEASE);
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
 			pipeline_event_enqueue(dev, port, &ev);

However, there are two reasons to prevent this:

First, compared with other tests in app/eventdev, for example the
eventdev perf test, the wmb is placed after the event operation to ensure
the operation has finished before w->processed_pkts++.

In the case of the perf_* tests, which start with a dequeue() and finally
end with a mempool_put(), shouldn't these also act as implicit
acquire/release pairs, making the stats consistent?


For perf tests, this consistency refers to the fact that there is a wmb
after mempool_put().

Please refer to this link:

https://urldefense.proofpoint.com/v2/url?u=http-3A__patches.dpdk.org_patch_85634_&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=1cjuAHrGh745jHNmj2fD85sUMIJ2IPIDsIJzo6FN6Z0&m=zgQHeSDiXWfI1PIIUxXBqMS6E-2_3G46nhrzGXoBpHI&s=0FwTxPXjWflh-sdmnkY133IPlJB780x0yxe7Am3JCBw&e=


So, if we move the release barriers before tx/fwd, the tests in
app/eventdev may become inconsistent. This may reduce the
maintainability of the code and make it difficult to understand.

Second, it is a test case. Though a heavy thread may cause performance
degradation, it can ensure that the operation process and the test
result are correct. And maybe for a test case, correctness is more
important than performance.

Most of our internal perf tests run on 24/48 core combinations, and since
the Octeontx2 event device driver supports a burst size of 1, it will show
up as a huge performance degradation.

For the impact on performance, I did the test using the software driver;
following are some test results:

-----------------------------------------------------------------------
Architecture: aarch64
Nics: ixgbe-82599
CPU: Cortex-A72
BURST_SIZE: 1
Order: ./dpdk-test-eventdev -l 0-15 -s 0x2 --vdev=event_sw0 -- --test=pipeline_queue --wlcore=4-14 --prod_type_ethdev --stlist=a,a
Flow: one flow, 64bits package, TX rate: 1.4Mpps

Without this patch:
0.954 mpps avg 0.953 mpps

With this patch:
0.932 mpps avg 0.930 mpps
-----------------------------------------------------------------------
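
As a quick check on the figures above, the relative difference between the two averages can be computed as (before - after) / before. The helper below is a hypothetical illustration, not part of the test application; with the reported averages of 0.953 and 0.930 mpps it works out to roughly 2.4 percent.

```c
/* Relative throughput drop in percent: (before - after) / before * 100. */
static double rel_drop_percent(double before_mpps, double after_mpps)
{
	return (before_mpps - after_mpps) / before_mpps * 100.0;
}
```

For example, `rel_drop_percent(0.953, 0.930)` uses the two "avg" values from the results, and `rel_drop_percent(0.954, 0.932)` uses the two instantaneous values.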



Based on the result above, there is no significant performance
degradation with this patch.

This is because the release barrier is only for "w->processed_pkts++".
It just ensures that the worker core increments the number of events
processed after enqueue, and it doesn't affect dequeue/enqueue:

dequeue -> enqueue -> release barrier -> w->processed_pkts++
Here enqueue already acts as an explicit release barrier.
Please refer to the reasons above.

On the other hand, I infer that the reason for the slight decrease in
measured performance is that the release barrier prevents
"w->processed_pkts++" from happening before the event has been processed
(enqueue). But I think this test result is closer to the real performance.

And sorry that we have no Octeontx2 device, so there is no test result
on the Octeontx2 event device driver. Would you please help us test this
patch on Octeontx2 when it is convenient for you? Thanks very much.
I will report the performance numbers on Monday.
That’s great, Thanks very much for your help.

Best Regards
Feifei

Best Regards
Feifei
Regards,
Pavan.

So, due to the two reasons above, I'm ambivalent about what we should
do in the next step.

Best Regards
Feifei

Regards,
Pavan.

Fixes: 314bcf58ca8f ("app/eventdev: add pipeline queue worker functions")
Cc: pbhagavatula@marvell.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 app/test-eventdev/test_pipeline_queue.c | 64 +++++++++++++++++++++----
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/app/test-eventdev/test_pipeline_queue.c b/app/test-eventdev/test_pipeline_queue.c
index 7bebac34f..0c0ec0ceb 100644
--- a/app/test-eventdev/test_pipeline_queue.c
+++ b/app/test-eventdev/test_pipeline_queue.c
@@ -30,7 +30,13 @@ pipeline_queue_worker_single_stage_tx(void *arg)

 		if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) {
 			pipeline_event_tx(dev, port, &ev);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 		} else {
 			ev.queue_id++;
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
@@ -59,7 +65,13 @@ pipeline_queue_worker_single_stage_fwd(void *arg)
 		rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 		pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
 		pipeline_event_enqueue(dev, port, &ev);
-		w->processed_pkts++;
+
+		/* release barrier here ensures stored operation
+		 * of the event completes before the number of
+		 * processed pkts is visible to the main core
+		 */
+		__atomic_fetch_add(&(w->processed_pkts), 1,
+				__ATOMIC_RELEASE);
 	}

 	return 0;
@@ -84,7 +96,13 @@ pipeline_queue_worker_single_stage_burst_tx(void *arg)
 			if (ev[i].sched_type == RTE_SCHED_TYPE_ATOMIC) {
 				pipeline_event_tx(dev, port, &ev[i]);
 				ev[i].op = RTE_EVENT_OP_RELEASE;
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 			} else {
 				ev[i].queue_id++;
 				pipeline_fwd_event(&ev[i],
@@ -121,7 +139,13 @@ pipeline_queue_worker_single_stage_burst_fwd(void *arg)
 		}

 		pipeline_event_enqueue_burst(dev, port, ev, nb_rx);
-		w->processed_pkts += nb_rx;
+
+		/* release barrier here ensures stored operation
+		 * of the event completes before the number of
+		 * processed pkts is visible to the main core
+		 */
+		__atomic_fetch_add(&(w->processed_pkts), nb_rx,
+				__ATOMIC_RELEASE);
 	}

 	return 0;
@@ -146,7 +170,13 @@ pipeline_queue_worker_multi_stage_tx(void *arg)

 		if (ev.queue_id == tx_queue[ev.mbuf->port]) {
 			pipeline_event_tx(dev, port, &ev);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 			continue;
 		}

@@ -180,7 +210,13 @@ pipeline_queue_worker_multi_stage_fwd(void *arg)
 			ev.queue_id = tx_queue[ev.mbuf->port];
 			rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 		} else {
 			ev.queue_id++;
 			pipeline_fwd_event(&ev, sched_type_list[cq_id]);
@@ -214,7 +250,13 @@ pipeline_queue_worker_multi_stage_burst_tx(void *arg)
 			if (ev[i].queue_id == tx_queue[ev[i].mbuf->port]) {
 				pipeline_event_tx(dev, port, &ev[i]);
 				ev[i].op = RTE_EVENT_OP_RELEASE;
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 				continue;
 			}

@@ -254,7 +296,13 @@ pipeline_queue_worker_multi_stage_burst_fwd(void *arg)
 				rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, 0);
 				pipeline_fwd_event(&ev[i],
 						RTE_SCHED_TYPE_ATOMIC);
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 			} else {
 				ev[i].queue_id++;
 				pipeline_fwd_event(&ev[i],
--
2.17.1
  