Thread: 91 messages, 5 authors, 2017-03-23

Re: [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver

From: Van Haaren, Harry <hidden>
Date: 2017-02-08 10:44:15

-----Original Message-----
From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
Sent: Wednesday, February 8, 2017 10:23 AM
To: Van Haaren, Harry <redacted>
Cc: dev@dpdk.org; Richardson, Bruce <redacted>; Hunt, David
[off-list ref]; nipun.gupta@nxp.com; hemant.agrawal@nxp.com; Eads, Gage
[off-list ref]
Subject: Re: [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver
<snip>
 
Thanks for the SW-driver-specific test cases. They gave me good insight
into the application behavior the SW driver expects, which in turn exposed
some challenges for portable applications.

I would like to highlight a main difference between the implementations and get
consensus on how to abstract it.

Thanks for taking the time to detail your thoughts - the examples certainly help to get a better picture of the whole.

Based on the existing header file, we can do event pipelining in two different ways:
a) Flow-based event pipelining
b) queue_id based event pipelining

I will provide an example to showcase the application flow in both modes.
Based on my understanding of the SW driver source code, it supports only
queue_id based event pipelining. I guess flow-based event pipelining will
work semantically with the SW driver, but it will be very slow.

I think the reason for the difference is how each implementation defines the context:
SW model: the context is queue_id
Cavium HW model: the context is queue_id + flow_id + sub_event_type + event_type

AFAIK, queue_id based event pipelining will work with NXP HW, but I am not
sure about the flow-based event pipelining model with NXP HW. I would
appreciate any input on this.

In Cavium HW, we support both modes.

As an open question: should we add a capability flag to advertise the supported
models and let the application choose the model based on the implementation's
capability? The downside is that a small portion of the stage-advance code will
be different, but we can reuse the stage-specific application code (I think it is
a fair trade-off).

Bruce, Harry, Gage, Hemant, Nipun
Thoughts? Or any other proposal?
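
A minimal sketch of what such a capability check could look like, for illustration only. The flag names (EV_CAP_QUEUE_PIPELINE, EV_CAP_FLOW_PIPELINE), the choose_pipeline_model() helper, and the preference order are assumptions made for this sketch; they are not names from the eventdev header under discussion.

```c
#include <stdint.h>

/* Hypothetical capability bits, invented for this sketch. */
#define EV_CAP_QUEUE_PIPELINE (1u << 0) /* stage selected by queue_id */
#define EV_CAP_FLOW_PIPELINE  (1u << 1) /* stage selected by event_type + sub_event_type */

enum pipeline_model {
	PIPELINE_NONE = 0,
	PIPELINE_QUEUE_BASED,
	PIPELINE_FLOW_BASED,
};

/*
 * Pick a pipelining model from the advertised capability bits.
 * Preferring the flow-based model when both are offered is an
 * arbitrary choice made for this sketch.
 */
static enum pipeline_model
choose_pipeline_model(uint32_t dev_caps)
{
	if (dev_caps & EV_CAP_FLOW_PIPELINE)
		return PIPELINE_FLOW_BASED;
	if (dev_caps & EV_CAP_QUEUE_PIPELINE)
		return PIPELINE_QUEUE_BASED;
	return PIPELINE_NONE;
}
```

An application would call choose_pipeline_model() once at setup and then install the matching stage-advance code, leaving the per-stage work unchanged.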

[HvH] Comments inline.

 
I will take a non-trivial real-world networking use case to show the difference.
Standard IPSec outbound processing has a minimum of 4 to 5 stages:

stage_0:
--------
a) Takes the pkts from ethdev and pushes them to eventdev as
RTE_EVENT_OP_NEW
b) In some HW implementations this is done by HW. In the SW implementation
it is done by service cores

stage_1:(ORDERED)
------------------
a) Receives pkts from stage_0 as an ORDERED flow and processes them in parallel
on N cores
b) Finds the SA that the packet belongs to and moves to the next stage for
SA-specific outbound operations. Outbound processing starts with updating the
sequence number in the critical section, followed by packet encryption in
parallel.

stage_2(ATOMIC) based on SA
----------------------------
a) Update the sequence number and move to ORDERED sched_type for packet
encryption in parallel

stage_3(ORDERED) based on SA
----------------------------
a) Encrypt the packets in parallel
b) Do output route look-up and figure out tx port and queue to transmit
the packet
c) Move to ATOMIC stage based on tx port and tx queue_id to transmit
the packet _without_ losing the ingress ordering

stage_4(ATOMIC) based on tx port/tx queue
-----------------------------------------
a) enqueue the encrypted packet to ethdev tx port/tx_queue
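
The five stages above can be summarized as a small table, sketched here in C. The enum and struct names are hypothetical stand-ins (not DPDK types); where the description does not state a schedule type or flow key (stage 0, and the flow key of stage 1), the entry is marked as unspecified rather than guessed.

```c
#include <stddef.h>

/* Stand-ins for the schedule types named in the stage list. */
enum stage_sched { STAGE_SCHED_UNSPECIFIED, STAGE_SCHED_ORDERED, STAGE_SCHED_ATOMIC };

/* What the flow is keyed on in each stage, as far as the text says. */
enum stage_key { STAGE_KEY_UNSPECIFIED, STAGE_KEY_SA, STAGE_KEY_TX_PORT_QUEUE };

struct stage_desc {
	enum stage_sched sched;
	enum stage_key key;
	const char *work;
};

static const struct stage_desc ipsec_outbound[] = {
	[0] = { STAGE_SCHED_UNSPECIFIED, STAGE_KEY_UNSPECIFIED,
		"inject packets from ethdev as new events" },
	[1] = { STAGE_SCHED_ORDERED, STAGE_KEY_UNSPECIFIED,
		"find the SA for the packet" },
	[2] = { STAGE_SCHED_ATOMIC, STAGE_KEY_SA,
		"update the sequence number" },
	[3] = { STAGE_SCHED_ORDERED, STAGE_KEY_SA,
		"encrypt, then look up tx port and queue" },
	[4] = { STAGE_SCHED_ATOMIC, STAGE_KEY_TX_PORT_QUEUE,
		"enqueue to ethdev tx port/tx queue" },
};

#define IPSEC_OUTBOUND_NB_STAGES \
	(sizeof(ipsec_outbound) / sizeof(ipsec_outbound[0]))
```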


1) queue_id based event pipelining
=================================

stage_1_work(assigned to event queue 1)# N ports/N cores establish
link to queue 1 through rte_event_port_link()

on_each_cores_linked_to_queue1(stage1)

[HvH] All worker cores can be linked to all stages - we do a lookup of what stage the work is based on the event->queue_id.

while(1)
{
                /* STAGE 1 processing */
                nr_events = rte_event_dequeue_burst(ev,..);
                if (!nr_events)
                                continue;

                sa = find_sa_from_packet(ev.mbuf);

                /* move to next stage(ATOMIC) */
                ev.event_type = RTE_EVENT_TYPE_CPU;
                ev.sub_event_type = 2;
                ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
                ev.flow_id =  sa;
                ev.op = RTE_EVENT_OP_FORWARD;
                ev.queue_id = 2;
                /* move to stage 2(event queue 2) */
                rte_event_enqueue_burst(ev,..);
}

on_each_cores_linked_to_queue2(stage2)
while(1)
{
                /* STAGE 2 processing */
                nr_events = rte_event_dequeue_burst(ev,..);
                if (!nr_events)
                                continue;

                /* seq number update in critical section */
                sa_specific_atomic_processing(sa /* ev.flow_id */);

                /* move to next stage(ORDERED) */
                ev.event_type = RTE_EVENT_TYPE_CPU;
                ev.sub_event_type = 3;
                ev.sched_type = RTE_SCHED_TYPE_ORDERED;
                ev.flow_id =  sa;
                ev.op = RTE_EVENT_OP_FORWARD;
                ev.queue_id = 3;
                /* move to stage 3(event queue 3) */
                rte_event_enqueue_burst(ev,..);
}

on_each_cores_linked_to_queue3(stage3)
while(1)
{
                /* STAGE 3 processing */
                nr_events = rte_event_dequeue_burst(ev,..);
                if (!nr_events)
                                continue;

                /* packet encryption in parallel */
                sa_specific_ordered_processing(sa /* ev.flow_id */);

                /* move to next stage(ATOMIC) */
                ev.event_type = RTE_EVENT_TYPE_CPU;
                ev.sub_event_type = 4;
                ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
                output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
                ev.flow_id =  output_tx_port_queue;
                ev.op = RTE_EVENT_OP_FORWARD;
                ev.queue_id = 4;
                /* move to stage 4(event queue 4) */
                rte_event_enqueue_burst(ev,...);
}

on_each_cores_linked_to_queue4(stage4)
while(1)
{
                /* STAGE 4 processing */
                nr_events = rte_event_dequeue_burst(ev,..);
                if (!nr_events)
                                continue;

                rte_eth_tx_buffer();
}
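
The four per-queue loops above differ only in how they advance the event. Following the [HvH] remark that every worker can be linked to every queue and look up the stage from queue_id, the advance step could be folded into one function, sketched below. This is a self-contained illustration: struct mock_event and the MOCK_SCHED_* values are hypothetical stand-ins for struct rte_event and RTE_SCHED_TYPE_*, and the SA and tx port/queue values are passed in rather than derived from a packet.

```c
#include <stdint.h>

/* Stand-ins for the RTE_SCHED_TYPE_* values; the numbers are arbitrary. */
enum mock_sched { MOCK_SCHED_ORDERED, MOCK_SCHED_ATOMIC };

/* Hypothetical, cut-down event: only the fields this sketch touches. */
struct mock_event {
	uint8_t queue_id;
	enum mock_sched sched_type;
	uint32_t flow_id;
};

/*
 * Advance one event by one stage, using queue_id as the stage lookup.
 * Returns 1 if the event should be enqueued again, 0 if it leaves the
 * pipeline (stage 4 transmits it), and -1 for an unknown queue_id.
 */
static int
advance_by_queue_id(struct mock_event *ev, uint32_t sa, uint32_t tx_port_queue)
{
	switch (ev->queue_id) {
	case 1: /* SA found: move to the ATOMIC stage keyed on the SA */
		ev->flow_id = sa;
		ev->sched_type = MOCK_SCHED_ATOMIC;
		ev->queue_id = 2;
		return 1;
	case 2: /* sequence number updated: move to ORDERED encryption */
		ev->sched_type = MOCK_SCHED_ORDERED;
		ev->queue_id = 3;
		return 1;
	case 3: /* encrypted: move to the ATOMIC stage keyed on tx port/queue */
		ev->flow_id = tx_port_queue;
		ev->sched_type = MOCK_SCHED_ATOMIC;
		ev->queue_id = 4;
		return 1;
	case 4: /* transmit stage: nothing further to enqueue */
		return 0;
	default:
		return -1;
	}
}
```

A single worker loop would then dequeue, run the stage-specific work for ev->queue_id, call advance_by_queue_id(), and enqueue the event again whenever it returns 1.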

2) flow-based event pipelining
=============================

- No need to partition queues for different stages
- All the cores can operate on all the stages, which enables
automatic multicore scaling and true dynamic load balancing

[HvH] The sw case is the same - all cores can map to all stages, the lookup for stage of work is the queue_id.

- A fairly large number of SAs (on the order of 2^16 to 2^20) can be processed
in parallel, something the existing IPSec sample application has constraints on:
http://dpdk.org/doc/guides-16.04/sample_app_ug/ipsec_secgw.html

on_each_worker_cores()
while(1)
{
	nr_events = rte_event_dequeue_burst(ev,..);
	if (!nr_events)
		continue;

	/* STAGE 1 processing */
	if(ev.event_type == RTE_EVENT_TYPE_ETHDEV) {
		sa = find_it_from_packet(ev.mbuf);
		/* move to next stage2(ATOMIC) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 2;
		ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
		ev.flow_id =  sa;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);

	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 2) { /* stage 2 */

[HvH] In the case of software eventdev ev.queue_id is used instead of ev.sub_event_type - but this is the same lookup operation as mentioned above. I don't see a fundamental difference between these approaches?

		/* seq number update in critical section */
		sa_specific_atomic_processing(sa /* ev.flow_id */);
		/* move to next stage(ORDERED) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 3;
		ev.sched_type = RTE_SCHED_TYPE_ORDERED;
		ev.flow_id =  sa;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);

	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 3) { /* stage 3 */

		/* like encrypting packets in parallel */
		sa_specific_ordered_processing(sa /* ev.flow_id */);
		/* move to next stage(ATOMIC) */
		ev.event_type = RTE_EVENT_TYPE_CPU;
		ev.sub_event_type = 4;
		ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
		output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
		ev.flow_id =  output_tx_port_queue;
		ev.op = RTE_EVENT_OP_FORWARD;
		rte_event_enqueue_burst(ev,..);

	} else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 4) { /* stage 4 */
		rte_eth_tx_buffer();
	}
}
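
The two models differ in which event fields identify the stage. A minimal sketch of hiding that difference behind one lookup, so that the stage-specific work can be shared, might look as follows. All names are hypothetical: struct stage_event and the STAGE_EV_TYPE_* values stand in for struct rte_event and RTE_EVENT_TYPE_*, and mapping queue_id N to stage N follows the queue assignment used in the queue_id based example.

```c
#include <stdint.h>

/* Stand-ins for RTE_EVENT_TYPE_ETHDEV / RTE_EVENT_TYPE_CPU; values arbitrary. */
enum { STAGE_EV_TYPE_ETHDEV = 0, STAGE_EV_TYPE_CPU = 1 };

/* Hypothetical, cut-down event with only the fields used for the lookup. */
struct stage_event {
	uint8_t event_type;
	uint8_t sub_event_type;
	uint8_t queue_id;
};

/* queue_id based model: event queue N carries the work for stage N. */
static int
stage_from_queue_id(const struct stage_event *ev)
{
	return ev->queue_id;
}

/*
 * Flow-based model: packets arriving from ethdev are stage 1 work;
 * CPU-generated events carry the stage number in sub_event_type.
 * Returns -1 for an event type this sketch does not know about.
 */
static int
stage_from_event_type(const struct stage_event *ev)
{
	if (ev->event_type == STAGE_EV_TYPE_ETHDEV)
		return 1;
	if (ev->event_type == STAGE_EV_TYPE_CPU)
		return ev->sub_event_type;
	return -1;
}
```

With either function selected at setup time, the per-stage processing can be written once as a switch on the returned stage number.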

/Jerin
Cavium