Re: [PATCH v2 2/2] librte_ip_frag: add mbuf counter
From: Alex Kiselev <hidden>
Date: 2018-08-22 09:48:03
Hi Konstantin. Could we please make a final decision about counting mbufs? It still feels like unfinished business to me. Below are my final arguments. If they do not sound convincing to you, just NACK ;)
Hi Konstantin.

> Hi Alex, Sorry for delay in reply.
> > > > There might be situations (for example, an attack in which many fragmented packets are sent to a DPDK application in order to flood the fragmentation table) when no additional mbufs should be added to the fragmentation table because it already contains too many of them. Currently there is no way to determine the number of mbufs held in the fragmentation table. This patch makes it possible to keep track of the number of mbufs held in the fragmentation table.
> > > I understand your intention, but I am still not sure it is worth it. My thought was that you can estimate by the upper limit (num_entries * entries_per_bucket) or so.
> > No, I can't. The estimation error might be so big that there would be no difference at all.
> Not sure why? If you use the upper limit, then the worst thing that could happen is that you would start your table cleanup a bit earlier.
Since the bucket size is 4, the estimate can be off by a factor of 4 (400%). For example, if I want to set the upper limit (the maximum number of mbufs that can be stored in the frag table) to 20% of all my available mbufs, I have to be prepared for 80% of all mbufs ending up in the frag table (every bucket is full). Or, if I take the bucket size into account and divide 20% by 4 so that the number of mbufs is exactly 20% in the worst case, when every bucket is full, I could end up in the opposite corner case, where exactly one mbuf is stored in every bucket and the effective upper limit is 20 / 4 = 5%. Neither way is good: either you have to reserve extra mbufs just to absorb the estimation error, or your upper limit is too small and you will be dropping good fragments.
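To make the arithmetic above concrete, here is a minimal sketch. It is only an illustration of the estimation error: the helper mbuf_range_for() and struct mbuf_range are hypothetical names, not part of librte_ip_frag, and it assumes every occupied slot holds between 1 and per_slot mbufs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the range of mbufs that may really be held. */
struct mbuf_range {
	uint32_t min; /* every occupied slot holds a single mbuf */
	uint32_t max; /* every occupied slot is completely full */
};

/*
 * Given the number of occupied slots and the slot capacity, return the
 * lowest and highest possible number of mbufs actually stored.
 */
static struct mbuf_range
mbuf_range_for(uint32_t used_slots, uint32_t per_slot)
{
	assert(per_slot > 0);
	struct mbuf_range r = { used_slots, used_slots * per_slot };
	return r;
}
```

With a pool of 10000 mbufs and a 20% target, capping at 2000 occupied slots gives a real range of 2000 to 8000 mbufs (20% to 80%), while capping at 500 slots gives 500 to 2000 mbufs (5% to 20%).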
> > > Probably another way to account for the number of mbufs without changes in the lib is to apply something like this (assuming that your fragments are not multi-segment):
> > > uint32_t mbuf_in_frag_table = 0;
> > > ....
> > > n = dr->cnt;
> > > mb = rte_ipv4_frag_reassemble_packet(...);
> > > if (mb != NULL)
> > >         mbuf_in_frag_table += mb->nb_segs;
> > > mbuf_in_frag_table += dr->cnt - n + 1;
> Sorry, my bad, I think it should be mbuf_in_frag_table -= dr->cnt - n + 1;
> > > In theory that could be applied even if fragments might be multi-segment, but for that we would need to change rte_ip_frag_free_death_row() to return the total number of freed segments.
> > That should be slightly more complicated wrapper code:
> > uint32_t mbuf_in_frag_table = 0;
> > ....
> > n = dr->cnt;
> > reassembled_mbuf = rte_ipv4_frag_reassemble_packet(..., fragmented_mbuf, ...);
> > if (reassembled_mbuf == NULL)
> >         mbuf_in_frag_table += fragmented_mbuf->nb_segs;
> We don't know for sure here. fragmented_mbuf could be in the death row by now.
Yes. That's exactly why you have to keep track of mbufs here and later, after rte_ip_frag_free_death_row().
The user has to think of the frag table and the death row as a single entity, a kind of black box, since it is impossible to say where (in the frag table or in the death row) your mbuf will be after you call rte_ipv4_frag_reassemble_packet(). So the caller should keep track of mbufs at every boundary/interface of that black box. One interface is rte_ipv4_frag_reassemble_packet() and the other is rte_ip_frag_free_death_row().
So, that's why it's easier to keep track of mbufs inside the library.
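As a rough illustration of the black-box accounting described above, here is a self-contained toy model. Everything in it is hypothetical: toy_box, toy_reassemble() and toy_free_death_row() are stand-ins invented for this sketch and are not the DPDK API. In particular, the toy free function returns the number of freed mbufs, whereas with the real rte_ip_frag_free_death_row() the caller would have to compare dr->cnt before and after the call, as in the wrapper code in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the frag table plus death row (NOT the DPDK API). */
struct toy_box {
	uint32_t in_table;      /* fragments waiting for reassembly */
	uint32_t in_death_row;  /* fragments waiting to be freed */
	uint32_t frags_per_pkt; /* fragments needed to complete a packet */
};

/*
 * Feed one single-segment fragment into the box. Returns the number of
 * segments of the reassembled packet, or 0 when the fragment stayed inside
 * the box; the caller cannot tell whether it is in the table or death row.
 */
static uint32_t
toy_reassemble(struct toy_box *box, int is_bad)
{
	if (is_bad) { /* e.g. an invalid fragment goes to the death row */
		box->in_death_row++;
		return 0;
	}
	if (box->in_table + 1 == box->frags_per_pkt) {
		uint32_t segs = box->in_table + 1;
		box->in_table = 0;
		return segs;
	}
	box->in_table++;
	return 0;
}

/* Free everything on the death row; return how many mbufs were freed. */
static uint32_t
toy_free_death_row(struct toy_box *box)
{
	uint32_t n = box->in_death_row;
	box->in_death_row = 0;
	return n;
}

/* Caller-side counter, updated only at the two interfaces of the box. */
struct caller_count {
	uint32_t held;
};

static uint32_t
counted_reassemble(struct caller_count *c, struct toy_box *box, int is_bad)
{
	uint32_t segs = toy_reassemble(box, is_bad);
	c->held += 1;    /* one fragment went into the box */
	c->held -= segs; /* segs mbufs came back out, if any */
	return segs;
}

static void
counted_free_death_row(struct caller_count *c, struct toy_box *box)
{
	c->held -= toy_free_death_row(box);
}
```

The point of the sketch is the invariant held == in_table + in_death_row: it holds only if the caller wraps both calls, which is the bookkeeping burden the in-library counter would remove.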
> > else
> >         mbuf_in_frag_table -= reassembled_mbuf->nb_segs;
> > mbuf_in_frag_table += dr->cnt - n;
> > Also, in that case every rte_ip_frag_free_death_row() call needs wrapper code too.
> > n = dr->cnt;
> > rte_ip_frag_free_death_row(..)
> > mbuf_in_frag_table += dr->cnt - n;
> I don't think it is necessary. After a packet is put in the death row it is no longer in the table.
It's critical, since from the user's point of view the death row and the frag table are a black box, because rte_ipv4_frag_reassemble_packet() doesn't tell the caller where the packet has been stored (in the frag table or in the death row).
> Konstantin
> > I think my approach is simpler.
> > > Konstantin
> > > > Signed-off-by: Alex Kiselev <redacted>
> > > > ---
> > > >  lib/librte_ip_frag/ip_frag_common.h        | 16 +++++++++-------
> > > >  lib/librte_ip_frag/ip_frag_internal.c      | 16 +++++++++-------
> > > >  lib/librte_ip_frag/rte_ip_frag.h           | 18 +++++++++++++++++-
> > > >  lib/librte_ip_frag/rte_ip_frag_common.c    |  1 +
> > > >  lib/librte_ip_frag/rte_ip_frag_version.map |  1 +
> > > >  lib/librte_ip_frag/rte_ipv4_reassembly.c   |  2 +-
> > > >  lib/librte_ip_frag/rte_ipv6_reassembly.c   |  2 +-
> > > >  7 files changed, 39 insertions(+), 17 deletions(-)
> > > >
> > > > diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
> > > > index 0fdcc7d0f..9fe5c0559 100644
> > > > --- a/lib/librte_ip_frag/ip_frag_common.h
> > > > +++ b/lib/librte_ip_frag/ip_frag_common.h
> > > > @@ -32,15 +32,15 @@
> > > >  #endif /* IP_FRAG_TBL_STAT */
> > > >
> > > >  /* internal functions declarations */
> > > > -struct rte_mbuf * ip_frag_process(struct ip_frag_pkt *fp,
> > > > -		struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb,
> > > > -		uint16_t ofs, uint16_t len, uint16_t more_frags);
> > > > +struct rte_mbuf *ip_frag_process(struct rte_ip_frag_tbl *tbl,
> > > > +	struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> > > > +	struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags);
> > > >
> > > > -struct ip_frag_pkt * ip_frag_find(struct rte_ip_frag_tbl *tbl,
> > > > +struct ip_frag_pkt *ip_frag_find(struct rte_ip_frag_tbl *tbl,
> > > >  		struct rte_ip_frag_death_row *dr,
> > > >  		const struct ip_frag_key *key, uint64_t tms);
> > > >
> > > > -struct ip_frag_pkt * ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> > > > +struct ip_frag_pkt *ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> > > >  	const struct ip_frag_key *key, uint64_t tms,
> > > >  	struct ip_frag_pkt **free, struct ip_frag_pkt **stale);
> > > >
> > > > @@ -91,7 +91,8 @@ ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2)
> > > >
> > > >  /* put fragment on death row */
> > > >  static inline void
> > > > -ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr)
> > > > +ip_frag_free(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
> > > > +	struct rte_ip_frag_death_row *dr)
> > > >  {
> > > >  	uint32_t i, k;
> > > >
> > > > @@ -100,6 +101,7 @@ ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr)
> > > >  		if (fp->frags[i].mb != NULL) {
> > > >  			dr->row[k++] = fp->frags[i].mb;
> > > >  			fp->frags[i].mb = NULL;
> > > > +			tbl->nb_mbufs--;
> > > >  		}
> > > >  	}
> > > >
> > > > @@ -160,7 +162,7 @@ static inline void
> > > >  ip_frag_tbl_del(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
> > > >  	struct ip_frag_pkt *fp)
> > > >  {
> > > > -	ip_frag_free(fp, dr);
> > > > +	ip_frag_free(tbl, fp, dr);
> > > >  	ip_frag_key_invalidate(&fp->key);
> > > >  	TAILQ_REMOVE(&tbl->lru, fp, lru);
> > > >  	tbl->use_entries--;
> > > > diff --git a/lib/librte_ip_frag/ip_frag_internal.c b/lib/librte_ip_frag/ip_frag_internal.c
> > > > index 97470a872..4c47d3fb4 100644
> > > > --- a/lib/librte_ip_frag/ip_frag_internal.c
> > > > +++ b/lib/librte_ip_frag/ip_frag_internal.c
> > > > @@ -29,14 +29,13 @@ static inline void
> > > >  ip_frag_tbl_reuse(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
> > > >  	struct ip_frag_pkt *fp, uint64_t tms)
> > > >  {
> > > > -	ip_frag_free(fp, dr);
> > > > +	ip_frag_free(tbl, fp, dr);
> > > >  	ip_frag_reset(fp, tms);
> > > >  	TAILQ_REMOVE(&tbl->lru, fp, lru);
> > > >  	TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
> > > >  	IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
> > > >  }
> > > >
> > > > -
> > > >  static inline void
> > > >  ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
> > > >  {
> > > > @@ -88,8 +87,9 @@ ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
> > > >  }
> > > >
> > > >  struct rte_mbuf *
> > > > -ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> > > > -	struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
> > > > +ip_frag_process(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
> > > > +	struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, uint16_t ofs,
> > > > +	uint16_t len, uint16_t more_frags)
> > > >  {
> > > >  	uint32_t idx;
> > > >
> > > > @@ -147,7 +147,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> > > >  			fp->frags[IP_LAST_FRAG_IDX].len);
> > > >
> > > >  		/* free all fragments, invalidate the entry. */
> > > > -		ip_frag_free(fp, dr);
> > > > +		ip_frag_free(tbl, fp, dr);
> > > >  		ip_frag_key_invalidate(&fp->key);
> > > >  		IP_FRAG_MBUF2DR(dr, mb);
> > > >
> > > > @@ -157,6 +157,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> > > >  	fp->frags[idx].ofs = ofs;
> > > >  	fp->frags[idx].len = len;
> > > >  	fp->frags[idx].mb = mb;
> > > > +	tbl->nb_mbufs++;
> > > >
> > > >  	mb = NULL;
> > > >
> > > > @@ -205,8 +206,9 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> > > >  				fp->frags[IP_LAST_FRAG_IDX].len);
> > > >
> > > >  			/* free associated resources. */
> > > > -			ip_frag_free(fp, dr);
> > > > -		}
> > > > +			ip_frag_free(tbl, fp, dr);
> > > > +		} else
> > > > +			tbl->nb_mbufs -= fp->last_idx;
> > > >
> > > >  		/* we are done with that entry, invalidate it. */
> > > >  		ip_frag_key_invalidate(&fp->key);
> > > > diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
> > > > index 7f425f610..623934d87 100644
> > > > --- a/lib/librte_ip_frag/rte_ip_frag.h
> > > > +++ b/lib/librte_ip_frag/rte_ip_frag.h
> > > > @@ -96,6 +96,7 @@ struct rte_ip_frag_tbl {
> > > >  	uint32_t bucket_entries; /**< hash associativity. */
> > > >  	uint32_t nb_entries; /**< total size of the table. */
> > > >  	uint32_t nb_buckets; /**< num of associativity lines. */
> > > > +	uint32_t nb_mbufs; /**< num of mbufs holded in the tbl. */
> > > >  	struct ip_frag_pkt *last; /**< last used entry. */
> > > >  	struct ip_pkt_list lru; /**< LRU list for table entries. */
> > > >  	struct ip_frag_tbl_stat stat; /**< statistics counters. */
> > > > @@ -329,8 +330,23 @@ void
> > > >  rte_ip_frag_table_statistics_dump(FILE * f, const struct rte_ip_frag_tbl *tbl);
> > > >
> > > >  /**
> > > > - * Delete expired fragments
> > > > + * Number of mbufs holded in the fragmentation table.
> > > > + *
> > > > + * @param tbl
> > > > + *   Fragmentation table
> > > >   *
> > > > + * @return
> > > > + *   Number of mbufs holded in the fragmentation table.
> > > > + */
> > > > +static inline uint32_t __rte_experimental
> > > > +rte_frag_table_mbuf_count(const struct rte_ip_frag_tbl *tbl)
> > > > +{
> > > > +	return tbl->nb_mbufs;
> > > > +}
> > > > +
> > > > +/**
> > > > + * Delete expired fragments
> > > > + *
> > > >   * @param tbl
> > > >   *   Table to delete expired fragments from
> > > >   * @param dr
> > > > diff --git a/lib/librte_ip_frag/rte_ip_frag_common.c b/lib/librte_ip_frag/rte_ip_frag_common.c
> > > > index a23f6f24f..46c2df84a 100644
> > > > --- a/lib/librte_ip_frag/rte_ip_frag_common.c
> > > > +++ b/lib/librte_ip_frag/rte_ip_frag_common.c
> > > > @@ -75,6 +75,7 @@ rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
> > > >  	tbl->nb_buckets = bucket_num;
> > > >  	tbl->bucket_entries = bucket_entries;
> > > >  	tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries - 1);
> > > > +	tbl->nb_mbufs = 0;
> > > >
> > > >  	TAILQ_INIT(&(tbl->lru));
> > > >  	return tbl;
> > > > diff --git a/lib/librte_ip_frag/rte_ip_frag_version.map b/lib/librte_ip_frag/rte_ip_frag_version.map
> > > > index d40d5515f..f4700f460 100644
> > > > --- a/lib/librte_ip_frag/rte_ip_frag_version.map
> > > > +++ b/lib/librte_ip_frag/rte_ip_frag_version.map
> > > > @@ -23,4 +23,5 @@ EXPERIMENTAL {
> > > >  	global:
> > > >
> > > >  	rte_frag_table_del_expired_entries;
> > > > +	rte_frag_table_mbuf_count;
> > > >  };
> > > > diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> > > > index 4956b99ea..fbdfd860a 100644
> > > > --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
> > > > +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> > > > @@ -146,7 +146,7 @@ rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
> > > >
> > > >  	/* process the fragmented packet. */
> > > > -	mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
> > > > +	mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len, ip_flag);
> > > >  	ip_frag_inuse(tbl, fp);
> > > >
> > > >  	IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> > > > diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> > > > index db249fe60..dda5a57b7 100644
> > > > --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
> > > > +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> > > > @@ -186,7 +186,7 @@ rte_ipv6_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
> > > >
> > > >  	/* process the fragmented packet. */
> > > > -	mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len,
> > > > +	mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len,
> > > >  			MORE_FRAGS(frag_hdr->frag_data));
> > > >  	ip_frag_inuse(tbl, fp);
> > > >
> > > > --
> > > > 2.16.1.windows.1
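
For illustration only, a caller-side policy built on the proposed counter might look like the sketch below. The helper frag_table_over_limit() is a hypothetical name and is not part of the patch; it assumes the held value would come from rte_frag_table_mbuf_count(tbl) as proposed above, and that the application knows its mbuf pool size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical policy helper (not part of the patch): returns 1 when the
 * frag table holds more than max_percent of the mbuf pool, otherwise 0.
 * `held` is assumed to be the value of rte_frag_table_mbuf_count(tbl).
 */
static int
frag_table_over_limit(uint32_t held, uint32_t pool_size, uint32_t max_percent)
{
	assert(max_percent <= 100);
	/* 64-bit math avoids overflow in the multiplications. */
	return (uint64_t)held * 100 > (uint64_t)pool_size * max_percent;
}
```

An application could check this before handing a new fragment to the reassembly call and drop the fragment, or expire old entries, when the limit is exceeded. Note that the counter in the patch covers only the table, not the death row.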
> > -- Alex
-- Alex