Thread (128 messages) 128 messages, 10 authors, 2017-07-10

Re: [PATCH v3 1/3] lib: add Generic Receive Offload API framework

From: Jiayu Hu <hidden>
Date: 2017-05-27 03:41:21

On Sat, May 27, 2017 at 07:10:21AM +0800, Ananyev, Konstantin wrote:
Hi Jiayu,
quoted
-----Original Message-----
From: Hu, Jiayu
Sent: Friday, May 26, 2017 8:26 AM
To: Ananyev, Konstantin <redacted>
Cc: dev@dpdk.org; Wiles, Keith <redacted>; yuanhan.liu@linux.intel.com
Subject: Re: [PATCH v3 1/3] lib: add Generic Receive Offload API framework

Hi Konstantin,

On Wed, May 24, 2017 at 08:38:25PM +0800, Ananyev, Konstantin wrote:
quoted
Hi Jiayu,
quoted
Hi Konstantin,

Thanks for your comments. My replies/questions are below.

BRs,
Jiayu

On Mon, May 22, 2017 at 05:19:19PM +0800, Ananyev, Konstantin wrote:
quoted
Hi Jiayu,
My comments/questions below.
Konstantin
quoted
For applications, DPDK GRO provides three external functions to
enable/disable GRO:
- rte_gro_init: initialize GRO environment;
- rte_gro_enable: enable GRO for a given port;
- rte_gro_disable: disable GRO for a given port.
Before using GRO, applications should explicitly call rte_gro_init to
initizalize GRO environment. After that, applications can call
rte_gro_enable to enable GRO and call rte_gro_disable to disable GRO for
specific ports.
I think this is too restrictive and wouldn't meet various user's needs.
User might want to:
- enable/disable GRO for particular RX queue
- or even setup different GRO types for different RX queues,
   i.e, - GRO over IPV4/TCP for queue 0, and  GRO over IPV6/TCP for queue 1, etc.
The reason for enabling/disabling GRO per-port instead of per-queue is that LINUX
controls GRO per-port. To control GRO per-queue indeed can provide more flexibility
to applications. But are there any scenarios that different queues of a port may
require different GRO control (i.e. GRO types and enable/disable GRO)?
I think yes.
quoted
quoted
- For various reasons, user might prefer not to use RX callbacks for various reasons,
  But invoke gro() manually at somepoint in his code.
An application-used GRO library can enable more flexibility to applications. Besides,
when perform GRO in ethdev layer or inside PMD drivers, it is an issue that
rte_eth_rx_burst returns actually received packet number or GROed packet number. And
the same issue happens in GSO, and even more seriously. This is because applications
, like VPP, always rely on the return value of rte_eth_tx_burst to decide further
operations. If applications can direcly call GRO/GSO libraries, this issue won't exist.
And DPDK is a library, which is not a holistic system like LINUX. We don't need to do
the same as LINUX. Therefore, maybe it's a better idea to directly provide SW
segmentation/reassembling libraries to applications.
quoted
- Many users would like to control size (number of flows/items per flow),
  max allowed packet size, max timeout, etc., for different GRO tables.
- User would need a way to flush all or only timeout packets from particular GRO tables.

So I think that API needs to extended to become be much more fine-grained.
Something like that:

struct rte_gro_tbl_param {
   int32_t socket_id;
   size_t max_flows;
   size_t max_items_per_flow;
   size_t max_pkt_size;
   uint64_t packet_timeout_cycles;
   <desired GRO types (IPV4_TCP | IPV6_TCP, ...)>
  <probably type specific params>
  ...
};

struct rte_gro_tbl;
strct rte_gro_tbl *rte_gro_tbl_create(const struct rte_gro_tbl_param *param);

void rte_gro_tbl_destroy(struct rte_gro_tbl *tbl);
Yes, I agree with you. It's necessary to provide more fine-grained control APIs to
applications. But what's 'packet_timeout_cycles' used for? Is it for TCP packets?
For any packets that sits in the gro_table for too long.
quoted
quoted
/*
 * process packets, might store some packets inside the GRO table,
 * returns number of filled entries in pkt[]
 */
uint32_t rte_gro_tbl_process(struct rte_gro_tbl *tbl, struct rte_mbuf *pkt[], uint32_t num);

/*
  * retirieves up to num timeouted packets from the table.
  */
uint32_t rtre_gro_tbl_timeout(struct rte_gro_tbl *tbl, uint64_t tmt, struct rte_mbuf *pkt[], uint32_t num);
Currently, we implement GRO as RX callback, whose processing logic is simple:
receive burst packets -> perform GRO -> return to application. GRO stops after
finishing processing received packets. If we provide rte_gro_tbl_timeout, when
and who will call it?
I mean the following scenario:
We receive a packet, find it is eligible for GRO and put it into gro_table
in expectation - there would be more packets for the same flow.
But it could happen that we would never (or for some long time) receive
any new packets for that flow.
So the first packet would never be delivered to the upper layer,
or delivered too late.
So we need a mechanism to extract such not merged packets
and push them to the upper layer.
My thought it would be application responsibility to call it from time to time.
That's actually another reason why I think we shouldn't use application
to always use RX callbacks for GRO.
Currently, we only provide one reassembly function, rte_gro_reassemble_burst,
which merges N inputted packets at a time. After finishing processing these
packets, it returns all of them and reset hashing tables. Therefore, there
are no packets in hashing tables after rte_gro_reassemble_burst returns.
Ok, sorry I missed that part with rte_hash_reset().
So gro_ressemble_burst() performs only inline processing on current input packets
and doesn't try to save packets for future merging, correct?
Yes.
Such approach indeed is much lightweight and doesn't require any extra timeouts and flush().
So my opinion let's keep it like that - nice and simple.
BTW, I think in that case we don't need any hashtables (or any other persistent strucures)at all.
What we need is just a set of GROs (tcp4, tpc6, etc.) we want to perform on given array of packets. 
Beside GRO types that are desired to perform, maybe it also needs max_pkt_size and
some GRO type specific information?
quoted
If we provide rte_gro_tbl_timeout, we also need to provide another reassmebly
function, like rte_gro_reassemble, which processes one given packet at a
time and won't reset hashing tables. Applications decide when to flush packets
in hashing tables. And rte_gro_tbl_timeout is one of the ways that can be used
to flush the packets. Do you mean that?
Yes, that's what I meant, but as I said above - I think your approach is probably
more preferable - it is much simpler and lightweight.
Konstantin
 
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help