Re: [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver
From: Van Haaren, Harry <hidden>
Date: 2017-02-08 10:44:15
-----Original Message-----
From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
Sent: Wednesday, February 8, 2017 10:23 AM
To: Van Haaren, Harry <redacted>
Cc: dev@dpdk.org; Richardson, Bruce <redacted>; Hunt, David [off-list ref]; nipun.gupta@nxp.com; hemant.agrawal@nxp.com; Eads, Gage [off-list ref]
Subject: Re: [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver
<snip>
Thanks for the SW driver specific test cases. They gave me good insight into the expected application behavior from the SW driver perspective, and in turn they exposed some challenges for portable applications. I would like to highlight a main difference between the implementations and get a consensus on how to abstract it.
Thanks for taking the time to detail your thoughts - the examples certainly help to get a better picture of the whole.
Based on the existing header file, we can do event pipelining in two different ways:
a) Flow-based event pipelining
b) queue_id based event pipelining

I will provide an example to showcase the application flow in both modes. Based on my understanding of the SW driver source code, it supports only queue_id based event pipelining. I guess flow-based event pipelining will work semantically with the SW driver, but it will be very slow.

I think the reason for the difference is the capability of the context definition:
SW model: the context is queue_id
Cavium HW model: the context is queue_id + flow_id + sub_event_type + event_type

AFAIK, queue_id based event pipelining will work with NXP HW, but I am not sure about the flow-based event pipelining model with NXP HW. I would appreciate any input on this.

In Cavium HW, we support both modes.

As an open question: should we add a capability flag to advertise the supported models and let the application choose the model based on the implementation capability? The downside is that a small portion of the stage advance code will be different, but we can reuse the STAGE specific application code (I think it is a fair trade-off).

Bruce, Harry, Gage, Hemant, Nipun: thoughts? Or any other proposal?
[HvH] Comments inline.
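To make the capability-flag question above concrete, here is a minimal sketch of how an application might pick a model from advertised capabilities. The flag names, the enum, and the helper are all hypothetical; nothing like them exists in the eventdev header under discussion, and preferring the flow-based model when both are advertised is an arbitrary choice made only for illustration.

```c
#include <stdint.h>

/* Hypothetical capability bits: illustrative names only, not part of
 * the eventdev header being reviewed. */
#define EVDEV_CAP_QUEUE_PIPELINE (1u << 0) /* stage carried in queue_id */
#define EVDEV_CAP_FLOW_PIPELINE  (1u << 1) /* stage carried in sub_event_type */

enum pipeline_model {
	PIPELINE_MODEL_NONE = 0,
	PIPELINE_MODEL_QUEUE,
	PIPELINE_MODEL_FLOW,
};

/* Choose a pipelining model from the capabilities a device advertises.
 * Assumption of this sketch: flow-based wins when both bits are set. */
static enum pipeline_model
choose_pipeline_model(uint32_t dev_caps)
{
	if (dev_caps & EVDEV_CAP_FLOW_PIPELINE)
		return PIPELINE_MODEL_FLOW;
	if (dev_caps & EVDEV_CAP_QUEUE_PIPELINE)
		return PIPELINE_MODEL_QUEUE;
	return PIPELINE_MODEL_NONE;
}
```

An application would call choose_pipeline_model() once at setup and select the matching stage advance code; the stage specific processing would stay the same in both cases.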
I will take a non-trivial real-world NW use case to show the difference. Standard IPSec outbound processing has a minimum of 4 to 5 stages.

stage_0:
--------
a) Takes the pkts from ethdev and pushes them to eventdev as RTE_EVENT_OP_NEW
b) In some HW implementations, this is done by HW. In the SW implementation, it is done by service cores

stage_1:(ORDERED)
------------------
a) Receives pkts from stage_0 in an ORDERED flow and processes them in parallel on N cores
b) Finds the SA that the packet belongs to and moves to the next stage for SA specific outbound operations. Outbound processing starts with updating the sequence number in the critical section, followed by packet encryption in parallel.

stage_2(ATOMIC) based on SA
----------------------------
a) Update the sequence number and move to ORDERED sched_type for packet encryption in parallel

stage_3(ORDERED) based on SA
----------------------------
a) Encrypt the packets in parallel
b) Do output route look-up and figure out tx port and queue to transmit the packet
c) Move to ATOMIC stage based on tx port and tx queue_id to transmit the packet _without_ losing the ingress ordering

stage_4(ATOMIC) based on tx port/tx queue
-----------------------------------------
a) enqueue the encrypted packet to ethdev tx port/tx_queue

1) queue_id based event pipelining
=================================

stage_1_work (assigned to event queue 1) # N ports/N cores establish link to queue 1 through rte_event_port_link()

on_each_cores_linked_to_queue1(stage1)
[HvH] All worker cores can be linked to all stages - we do a lookup of what stage the work is based on the event->queue_id.
while(1)
{
	/* STAGE 1 processing */
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	sa = find_sa_from_packet(ev.mbuf);

	/* move to next stage(ATOMIC) */
	ev.event_type = RTE_EVENT_TYPE_CPU;
	ev.sub_event_type = 2;
	ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
	ev.flow_id = sa;
	ev.op = RTE_EVENT_OP_FORWARD;
	ev.queue_id = 2;
	/* move to stage 2(event queue 2) */
	rte_event_enqueue_burst(ev,..);
}
on_each_cores_linked_to_queue2(stage2)
while(1)
{
	/* STAGE 2 processing */
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	/* seq number update in critical section */
	sa_specific_atomic_processing(sa /* ev.flow_id */);

	/* move to next stage(ORDERED) */
	ev.event_type = RTE_EVENT_TYPE_CPU;
	ev.sub_event_type = 3;
	ev.sched_type = RTE_SCHED_TYPE_ORDERED;
	ev.flow_id = sa;
	ev.op = RTE_EVENT_OP_FORWARD;
	ev.queue_id = 3;
	/* move to stage 3(event queue 3) */
	rte_event_enqueue_burst(ev,..);
}
on_each_cores_linked_to_queue3(stage3)
while(1)
{
	/* STAGE 3 processing */
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	/* packets encryption in parallel */
	sa_specific_ordered_processing(sa /* ev.flow_id */);

	/* move to next stage(ATOMIC) */
	ev.event_type = RTE_EVENT_TYPE_CPU;
	ev.sub_event_type = 4;
	ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
	output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
	ev.flow_id = output_tx_port_queue;
	ev.op = RTE_EVENT_OP_FORWARD;
	ev.queue_id = 4;
	/* move to stage 4(event queue 4) */
	rte_event_enqueue_burst(ev,..);
}
on_each_cores_linked_to_queue4(stage4)
while(1)
{
	/* STAGE 4 processing */
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	rte_eth_tx_buffer();
}
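As a sketch of the point that all workers can be linked to all queues, the four per-queue loops above can be folded into one handler that looks up the stage from queue_id. The struct below is a simplified stand-in and not the real event layout; the type, constant, and function names are hypothetical, and the SA and tx port/queue values are passed in rather than computed.

```c
#include <stdint.h>

/* Simplified stand-in for the event; only the fields this sketch needs.
 * This is NOT the real rte_event layout. */
struct qp_event {
	uint8_t queue_id;
	uint8_t sched_type;
	uint32_t flow_id;
};

/* Placeholder scheduling types (values are arbitrary for the sketch). */
enum { QP_SCHED_ORDERED = 0, QP_SCHED_ATOMIC = 1 };

/* Handle one event on any worker core. The stage is looked up from
 * queue_id, then the event is prepared for the next queue.
 * Returns the stage that was handled, or -1 for an unknown queue. */
static int
handle_event_by_queue(struct qp_event *ev, uint32_t sa, uint32_t tx_port_queue)
{
	int stage = ev->queue_id;

	switch (stage) {
	case 1: /* SA found: go ATOMIC on the SA */
		ev->sched_type = QP_SCHED_ATOMIC;
		ev->flow_id = sa;
		ev->queue_id = 2;
		break;
	case 2: /* sequence number updated: go ORDERED for encryption */
		ev->sched_type = QP_SCHED_ORDERED;
		ev->flow_id = sa;
		ev->queue_id = 3;
		break;
	case 3: /* encrypted and routed: go ATOMIC on the tx port/queue */
		ev->sched_type = QP_SCHED_ATOMIC;
		ev->flow_id = tx_port_queue;
		ev->queue_id = 4;
		break;
	case 4: /* transmit stage: nothing left to forward */
		break;
	default:
		return -1;
	}
	return stage;
}
```

With this shape, no core is tied to a particular stage; whichever core dequeues the event runs the stage that queue_id selects.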
2) flow-based event pipelining
=============================
- No need to partition queues for different stages
- All the cores can operate on all the stages, thus enabling automatic multicore scaling and true dynamic load balancing
[HvH] The SW case is the same: all cores can map to all stages, and the lookup for the stage of work is the queue_id.
- A fairly large number of SAs (on the order of 2^16 to 2^20) can be processed in parallel, something the existing IPSec application has constraints on:
http://dpdk.org/doc/guides-16.04/sample_app_ug/ipsec_secgw.html

on_each_worker_cores()
while(1)
{
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	/* STAGE 1 processing */
	if(ev.event_type == RTE_EVENT_TYPE_ETHDEV) {
		sa = find_it_from_packet(ev.mbuf);
		/* move to next stage2(ATOMIC) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 2;
		ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
		ev.flow_id = sa;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);
	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 2) { /* stage 2 */
[HvH] In the case of software eventdev ev.queue_id is used instead of ev.sub_event_type - but this is the same lookup operation as mentioned above. I don't see a fundamental difference between these approaches?
		/* seq number update in critical section */
		sa_specific_atomic_processing(sa /* ev.flow_id */);
		/* move to next stage(ORDERED) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 3;
		ev.sched_type = RTE_SCHED_TYPE_ORDERED;
		ev.flow_id = sa;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);
	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 3) { /* stage 3 */
		/* like encrypting packets in parallel */
		sa_specific_ordered_processing(sa /* ev.flow_id */);
		/* move to next stage(ATOMIC) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 4;
		ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
		output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
		ev.flow_id = output_tx_port_queue;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);
	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 4) { /* stage 4 */
		rte_eth_tx_buffer();
	}
}
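To illustrate the remark that the stage lookup is the same operation in both approaches, here is a minimal sketch in which only the lookup and the stage advance depend on the model, while the stage specific code could be shared. The struct and all names are hypothetical stand-ins, not the real event layout. As a simplification, the sketch ignores that the flow-based loop above recognizes stage 1 by event_type == RTE_EVENT_TYPE_ETHDEV rather than by sub_event_type.

```c
#include <stdint.h>

/* Simplified stand-in for the event; NOT the real rte_event layout. */
struct stage_event {
	uint8_t queue_id;
	uint8_t sub_event_type;
};

enum stage_model { STAGE_MODEL_QUEUE, STAGE_MODEL_FLOW };

/* Look up the current stage: from queue_id in the queue based model,
 * from sub_event_type in the flow based model. */
static uint8_t
current_stage(const struct stage_event *ev, enum stage_model model)
{
	return model == STAGE_MODEL_QUEUE ? ev->queue_id : ev->sub_event_type;
}

/* Advance to the next stage by writing the field that carries the
 * stage in the chosen model; the other field is left untouched. */
static void
advance_stage(struct stage_event *ev, enum stage_model model, uint8_t next)
{
	if (model == STAGE_MODEL_QUEUE)
		ev->queue_id = next;
	else
		ev->sub_event_type = next;
}
```

Under these assumptions the per-stage work (SA lookup, sequence number update, encryption, transmit) is identical in both models, and only these two small helpers differ.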
/Jerin
Cavium