[dpdk-dev] Re: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test
From: Feifei Wang <hidden>
Date: 2021-01-11 01:57:25
-----Original Message-----
From: Pavan Nikhilesh Bhagavatula [off-list ref]
Sent: 8 January 2021 18:58
To: Feifei Wang [off-list ref]; jerinj@marvell.com; Harry van Haaren [off-list ref]
Cc: dev@dpdk.org; nd [off-list ref]; Honnappa Nagarahalli [off-list ref]; stable@dpdk.org; Ruifeng Wang [off-list ref]; nd [off-list ref]; nd [off-list ref]; nd [off-list ref]
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,

Hi, Pavan

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula [off-list ref]
Sent: 8 January 2021 17:13
To: Feifei Wang [off-list ref]; jerinj@marvell.com; Harry van Haaren [off-list ref]
Cc: dev@dpdk.org; nd [off-list ref]; Honnappa Nagarahalli [off-list ref]; stable@dpdk.org; Ruifeng Wang [off-list ref]; nd [off-list ref]; nd [off-list ref]
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,

Hi, Pavan

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
Sent: 5 January 2021 17:29
To: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com; Harry van Haaren <harry.van.haaren@intel.com>
Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Hi Feifei,

Hi, Pavan

Sorry for my late reply and thanks very much for your review.

-----Original Message-----
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
Sent: 22 December 2020 18:33
To: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com; Harry van Haaren <harry.van.haaren@intel.com>; Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Phil Yang <Phil.Yang@arm.com>
Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test

Add release barriers before updating the processed packets for worker
lcores to ensure the worker lcore has really finished data processing
and then it can update the processed packets number.

I believe we can live with minor inaccuracies in the stats being presented, as atomics are pretty heavy when the scheduler is limited to a burst size of 1.

One option is to move it before a pipeline operation (pipeline_event_tx, pipeline_fwd_event etc.), as they imply an implicit release barrier (all the changes done to the event should be visible to the next core).

If I understand correctly, you mean moving the release barriers before pipeline_event_tx or pipeline_fwd_event. This can ensure the event has been processed before the next core begins to tx/fwd. For example:

What I meant was that event APIs such as `rte_event_enqueue_burst` and `rte_event_eth_tx_adapter_enqueue` act as an implicit release barrier, and the API `rte_event_dequeue_burst` acts as an implicit acquire barrier.

Since the pipeline_* tests start with a dequeue() and end with an enqueue(), I don't believe we need barriers in between.

Sorry for my misunderstanding. I agree with you that no barriers are needed between dequeue and enqueue.

Now, let's go back to the beginning. With this patch, our barrier is mainly for the synchronization variable "w->processed_pkts". The event is first dequeued and then enqueued; only after this can the event be treated as processed and included in the statistics ("w->processed_pkts++").

Thus, we add a release barrier before "w->processed_pkts++" to prevent this operation from being executed ahead of time. For example:

dequeue -> w->processed_pkts++ -> enqueue

In this case the worker has not actually finished processing the event, but the event is treated as processed and included in the statistics.

But the current sequence is dequeue -> enqueue -> w->processed_pkts++, and enqueue already acts as an explicit release barrier, right?

Sorry, maybe I do not understand how "enqueue" acts as an explicit release barrier. I can think of two possibilities:

1. As you said before, all the changes done to the event should be visible to the next core, and enqueue is an operation on the event, so the next core should wait for the event to be enqueued. I think this is due to a data dependence on the same variable. However, 'w->processed_pkts' and 'ev' are different variables, so this cannot prevent 'w->processed_pkts++' from happening before enqueue. The main core may then load the updated 'w->processed_pkts' while the event is actually still being processed. For example:

Time Slot    Worker 1                 Main core
1            dequeue
2            w->processed_pkts++
3                                     load w->processed_pkts
4            enqueue

2. Some release barriers are included in enqueue. There is a release barrier in rte_ring_enqueue:

move head -> copy elements to the ring -> release barrier -> update tail -> w->processed_pkts++

However, this barrier cannot prevent 'w->processed_pkts++' from moving before the tail update, and only when the tail update has finished can the enqueue be seen as completed.

I was talking about case 2 in particular: almost all enqueue calls have some kind of release barrier in place. I do agree that w->processed_pkts++ might get reordered with the tail update, but since enqueue itself is a ldr + blr I was hoping that it wouldn't occur. We can continue the discussion once I have some performance data.

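The ordering question in case 2 can be made concrete with a small sketch. This is not the rte_ring implementation: `toy_ring`, `toy_enqueue`, `worker_forward` and `main_core_poll` are hypothetical names, the ring is single-producer only and never drained, and the GCC `__atomic` builtins are used only because the patch uses them. The release store on the tail orders the element copy before the tail update, but by itself it does not constrain a later plain increment of the counter. Making the increment a release operation, paired with an acquire load on the main core, is what keeps the new counter value behind the completed enqueue.

```c
#include <stdint.h>

#define TOY_RING_SIZE 8u /* power of two, illustrative value only */

/* Hypothetical single-producer ring, not the rte_ring layout. */
struct toy_ring {
	uint32_t head;      /* producer-private in this sketch */
	uint32_t tail;      /* published to the consumer */
	uint32_t cons_tail; /* consumer position, never advanced here */
	uint64_t slots[TOY_RING_SIZE];
};

struct toy_worker {
	uint64_t processed_pkts; /* statistics read by the main core */
};

/* move head -> copy element -> release -> update tail.
 * Returns 1 on success, 0 if the ring is full.
 */
static int
toy_enqueue(struct toy_ring *r, uint64_t ev)
{
	uint32_t head = r->head;

	if (head - r->cons_tail >= TOY_RING_SIZE)
		return 0;
	r->head = head + 1;
	r->slots[head & (TOY_RING_SIZE - 1)] = ev;
	/* Release store: the slot write above is ordered before it. */
	__atomic_store_n(&r->tail, head + 1, __ATOMIC_RELEASE);
	return 1;
}

/* Worker side: count the event only after the enqueue has completed. */
static void
worker_forward(struct toy_ring *r, struct toy_worker *w, uint64_t ev)
{
	if (!toy_enqueue(r, ev))
		return;
	/* Release RMW: the tail update above is ordered before the new
	 * counter value for a reader that acquire-loads the counter.
	 */
	__atomic_fetch_add(&w->processed_pkts, 1, __ATOMIC_RELEASE);
}

/* Main-core side: acquire load pairs with the release increment. */
static uint64_t
main_core_poll(struct toy_worker *w)
{
	return __atomic_load_n(&w->processed_pkts, __ATOMIC_ACQUIRE);
}
```

With a plain `w->processed_pkts++` in `worker_forward`, nothing in this sketch would stop the increment from becoming visible before the tail store, which is the situation shown in the time-slot table.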
Ok, that's great. I think this is a meaningful discussion. Thanks for your effort.

Best Regards
Feifei

Thanks for your patience :)
Pavan.

______________________________________________________________________________

By the way, I have two other questions about the pipeline process test in "test_pipeline_queue".

1. When do we start counting processed events (w->processed_pkts)?

For the fwd mode (internal_port = false), when we choose single stage, the application increments the number of events processed after "pipeline_event_enqueue". However, when we choose multiple stages, the application increments the number of events processed before "pipeline_event_enqueue".

We count an event as processed when all the stages have completed and it is transmitted.

So, maybe we can unify this. For example, for multiple stages:

 if (cq_id == last_queue) {
 	ev.queue_id = tx_queue[ev.mbuf->port];
 	rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 	pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
+	pipeline_event_enqueue(dev, port, &ev);
 	w->processed_pkts++;
 } else {
 	ev.queue_id++;
 	pipeline_fwd_event(&ev, sched_type_list[cq_id]);
+	pipeline_event_enqueue(dev, port, &ev);
 }
-pipeline_event_enqueue(dev, port, &ev);

The above change makes sense.

Thanks for your review, I'll include this change in the next version.

2. Is "pipeline_event_enqueue" needed after "pipeline_event_tx" in tx mode?

For single_stage_burst_tx mode, after "pipeline_event_tx", the worker has to enqueue again because of "pipeline_event_enqueue_burst", so maybe we should jump out of the loop after "pipeline_event_tx",

We call enqueue burst to release the events, i.e. enqueue events with RTE_EVENT_OP_RELEASE.

However, in the single-event case, for 'pipeline_queue_worker_single_stage_tx' and 'pipeline_queue_worker_multi_stage_tx', there is no release operation after tx.

for example:

 	if (ev[i].sched_type == RTE_SCHED_TYPE_ATOMIC) {
 		pipeline_event_tx(dev, port, &ev[i]);
 		ev[i].op = RTE_EVENT_OP_RELEASE;
 		w->processed_pkts++;
+		continue;
 	} else {
 		ev[i].queue_id++;
 		pipeline_fwd_event(&ev[i], RTE_SCHED_TYPE_ATOMIC);
 	}
 }
 pipeline_event_enqueue_burst(dev, port, ev, nb_rx);

 if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) {
+	__atomic_thread_fence(__ATOMIC_RELEASE);
 	pipeline_event_tx(dev, port, &ev);
 	w->processed_pkts++;
 } else {
 	ev.queue_id++;
+	__atomic_thread_fence(__ATOMIC_RELEASE);
 	pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
 	pipeline_event_enqueue(dev, port, &ev);

However, there are two reasons not to do this:

First, compared with other tests in app/eventdev, for example the eventdev perf test, the wmb there is placed after the event operation to ensure the operation has finished before w->processed_pkts++.

In the case of the perf_* tests, starting with a dequeue() and finally ending with a mempool_put() should also act as an implicit acquire-release pair, making the stats consistent?

For the perf tests, this consistency comes from the wmb after mempool_put(). Please refer to this link:
https://urldefense.proofpoint.com/v2/url?u=http-3A__patches.dpdk.org_patch_85634_&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=1cjuAHrGh745jHNmj2fD85sUMIJ2IPIDsIJzo6FN6Z0&m=zgQHeSDiXWfI1PIIUxXBqMS6E-2_3G46nhrzGXoBpHI&s=0FwTxPXjWflh-sdmnkY133IPlJB780x0yxe7Am3JCBw&e=

So, if we move the release barriers before tx/fwd, the tests of app/eventdev may become inconsistent. This may reduce the maintainability of the code and make it difficult to understand.

Second, it is a test case. Though heavy thread may cause performance degradation, it can ensure that the operation process and the test result are correct. And maybe for a test case, correctness is more important than performance.

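The two placements compared in this part of the thread can be sketched side by side. The functions below are stand-ins, not the test's real helpers: `fake_tx`, `count_with_fence_before_tx` and `count_with_release_add` are hypothetical names, and `tx_done` stands in for whatever stores the transmit path performs. A release fence orders earlier accesses before later stores, so a fence placed before the tx call does not order the tx stores against the counter update. A release fetch-add placed after the tx call does.

```c
#include <stdint.h>

struct sketch_worker {
	uint64_t processed_pkts; /* statistics read by the main core */
	uint64_t tx_done;        /* stand-in for the stores made by tx */
};

/* Hypothetical stand-in for pipeline_event_tx(). */
static void
fake_tx(struct sketch_worker *w)
{
	w->tx_done++;
}

/* Placement from the example above: fence before the tx call. */
static void
count_with_fence_before_tx(struct sketch_worker *w)
{
	__atomic_thread_fence(__ATOMIC_RELEASE);
	fake_tx(w);
	w->processed_pkts++; /* not ordered against fake_tx()'s stores */
}

/* Placement used by the patch: release on the counter update itself. */
static void
count_with_release_add(struct sketch_worker *w)
{
	fake_tx(w);
	/* fake_tx()'s stores are ordered before the new counter value. */
	__atomic_fetch_add(&w->processed_pkts, 1, __ATOMIC_RELEASE);
}
```

Both variants produce the same counts in a single thread; they differ only in what another core is allowed to observe.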
Most of our internal perf tests run on 24/48 core combinations, and since the Octeontx2 event device driver supports a burst size of 1, it will show up as a huge performance degradation.

For the impact on performance, I ran the test using the software driver. The following are some test results:

--------------------------------------------------------------------
Architecture: aarch64
Nics: ixgbe-82599
CPU: Cortex-A72
BURST_SIZE: 1
Order: ./dpdk-test-eventdev -l 0-15 -s 0x2 --vdev=event_sw0 -- --test=pipeline_queue --wlcore=4-14 --prod_type_ethdev --stlist=a,a
Flow: one flow, 64bits package, TX rate: 1.4Mpps

Without this patch:
0.954 mpps avg 0.953 mpps

With this patch:
0.932 mpps avg 0.930 mpps
--------------------------------------------------------------------

Based on the result above, there is no significant performance degradation with this patch. This is because the release barrier is only for "w->processed_pkts++".

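For reference, the size of the drop can be worked out from the four figures reported above and nothing else. The helper name `relative_drop_pct` is hypothetical. With these inputs it gives roughly 2.3% for the current rate (0.954 to 0.932 mpps) and roughly 2.4% for the average (0.953 to 0.930 mpps).

```c
/* Relative drop, in percent, from a baseline rate to a patched rate. */
static double
relative_drop_pct(double baseline_mpps, double patched_mpps)
{
	return (baseline_mpps - patched_mpps) / baseline_mpps * 100.0;
}
```

Usage: `relative_drop_pct(0.954, 0.932)` for the current rate and `relative_drop_pct(0.953, 0.930)` for the average.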
It just ensures that the worker core increments the number of events processed after enqueue, and it doesn't affect dequeue/enqueue:

dequeue -> enqueue -> release barrier -> w->processed_pkts++

Here enqueue already acts as an explicit release barrier.

Please refer to the reasons above.

On the other hand, I infer that the reason for the slight decrease in measured performance is that the release barrier prevents "w->processed_pkts++" from happening before the event has been processed (enqueue). But I think this test result is closer to the real performance.

And sorry, we have no octeontx2 device, so there is no test result on the Octeontx2 event device driver. Would you please help us test this patch on octeontx2 when it is convenient for you? Thanks very much.

I will report the performance numbers on Monday.

That's great, thanks very much for your help.

Best Regards
Feifei

Best Regards
Feifei

Regards,
Pavan.

So, due to the two reasons above, I'm ambivalent about what we should do in the next step.

Best Regards
Feifei

Regards,
Pavan.

Fixes: 314bcf58ca8f ("app/eventdev: add pipeline queue worker functions")
Cc: pbhagavatula@marvell.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 app/test-eventdev/test_pipeline_queue.c | 64 +++++++++++++++++++++----
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/app/test-eventdev/test_pipeline_queue.c b/app/test-eventdev/test_pipeline_queue.c
index 7bebac34f..0c0ec0ceb 100644
--- a/app/test-eventdev/test_pipeline_queue.c
+++ b/app/test-eventdev/test_pipeline_queue.c
@@ -30,7 +30,13 @@ pipeline_queue_worker_single_stage_tx(void *arg)

 		if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) {
 			pipeline_event_tx(dev, port, &ev);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 		} else {
 			ev.queue_id++;
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
@@ -59,7 +65,13 @@ pipeline_queue_worker_single_stage_fwd(void *arg)
 		rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 		pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
 		pipeline_event_enqueue(dev, port, &ev);
-		w->processed_pkts++;
+
+		/* release barrier here ensures stored operation
+		 * of the event completes before the number of
+		 * processed pkts is visible to the main core
+		 */
+		__atomic_fetch_add(&(w->processed_pkts), 1,
+				__ATOMIC_RELEASE);
 	}

 	return 0;
@@ -84,7 +96,13 @@ pipeline_queue_worker_single_stage_burst_tx(void *arg)
 			if (ev[i].sched_type == RTE_SCHED_TYPE_ATOMIC) {
 				pipeline_event_tx(dev, port, &ev[i]);
 				ev[i].op = RTE_EVENT_OP_RELEASE;
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 			} else {
 				ev[i].queue_id++;
 				pipeline_fwd_event(&ev[i],
@@ -121,7 +139,13 @@ pipeline_queue_worker_single_stage_burst_fwd(void *arg)
 		}

 		pipeline_event_enqueue_burst(dev, port, ev, nb_rx);
-		w->processed_pkts += nb_rx;
+
+		/* release barrier here ensures stored operation
+		 * of the event completes before the number of
+		 * processed pkts is visible to the main core
+		 */
+		__atomic_fetch_add(&(w->processed_pkts), nb_rx,
+				__ATOMIC_RELEASE);
 	}

 	return 0;
@@ -146,7 +170,13 @@ pipeline_queue_worker_multi_stage_tx(void *arg)

 		if (ev.queue_id == tx_queue[ev.mbuf->port]) {
 			pipeline_event_tx(dev, port, &ev);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 			continue;
 		}

@@ -180,7 +210,13 @@ pipeline_queue_worker_multi_stage_fwd(void *arg)
 			ev.queue_id = tx_queue[ev.mbuf->port];
 			rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
 			pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
-			w->processed_pkts++;
+
+			/* release barrier here ensures stored operation
+			 * of the event completes before the number of
+			 * processed pkts is visible to the main core
+			 */
+			__atomic_fetch_add(&(w->processed_pkts), 1,
+					__ATOMIC_RELEASE);
 		} else {
 			ev.queue_id++;
 			pipeline_fwd_event(&ev, sched_type_list[cq_id]);
@@ -214,7 +250,13 @@ pipeline_queue_worker_multi_stage_burst_tx(void *arg)
 			if (ev[i].queue_id == tx_queue[ev[i].mbuf->port]) {
 				pipeline_event_tx(dev, port, &ev[i]);
 				ev[i].op = RTE_EVENT_OP_RELEASE;
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 				continue;
 			}

@@ -254,7 +296,13 @@ pipeline_queue_worker_multi_stage_burst_fwd(void *arg)
 				rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, 0);
 				pipeline_fwd_event(&ev[i],
 						RTE_SCHED_TYPE_ATOMIC);
-				w->processed_pkts++;
+
+				/* release barrier here ensures stored operation
+				 * of the event completes before the number of
+				 * processed pkts is visible to the main core
+				 */
+				__atomic_fetch_add(&(w->processed_pkts), 1,
+						__ATOMIC_RELEASE);
 			} else {
 				ev[i].queue_id++;
 				pipeline_fwd_event(&ev[i],
--
2.17.1