Thread (37 messages) 37 messages, 5 authors, 2019-10-29

Re: [ovs-dev] [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up

From: Tonghao Zhang <hidden>
Date: 2019-10-21 05:01:31

On Sat, Oct 19, 2019 at 7:27 AM William Tu [off-list ref] wrote:
On Wed, Oct 16, 2019 at 5:54 AM [off-list ref] wrote:
quoted
From: Tonghao Zhang <redacted>

The full looking up on flow table traverses all mask array.
If mask-array is too large, the number of invalid flow-mask
increase, performance will be drop.

One bad case, for example: M means flow-mask is valid and NULL
of flow-mask means deleted.

+-------------------------------------------+
| M | NULL | ...                  | NULL | M|
+-------------------------------------------+

In that case, without this patch, openvswitch will traverses all
mask array, because there will be one flow-mask in the tail. This
patch changes the way of flow-mask inserting and deleting, and the
mask array will be keep as below: there is not a NULL hole. In the
fast path, we can "break" "for" (not "continue") in flow_lookup
when we get a NULL flow-mask.

         "break"
            v
+-------------------------------------------+
| M | M |  NULL |...           | NULL | NULL|
+-------------------------------------------+

This patch don't optimize slow or control path, still using ma->max
to traverse. Slow path:
* tbl_mask_array_realloc
* ovs_flow_tbl_lookup_exact
* flow_mask_find

Hi Tonghao,

Does this improve performance? After all, the original code simply
check whether the mask is NULL, then goto next mask.
I tested the performance, but I disable the mask cache, and use the
dpdk-pktgen to generate packets:
The test ovs flow:
ovs-dpctl add-dp system@ovs-system
ovs-dpctl add-if system@ovs-system eth6
ovs-dpctl add-if system@ovs-system eth7

for m in $(seq 1 100 | xargs printf '%.2x\n'); do
       ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
2
done

ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=98:03:9b:6e:4a:f5/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
2
ovs-dpctl add-flow ovs-system
"in_port(2),eth(dst=98:03:9b:6e:4a:f4/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
1

for m in $(seq 101 160 | xargs printf '%.2x\n'); do
        ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
2
done

for m in $(seq 1 100 | xargs printf '%.2x\n'); do
         ovs-dpctl del-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
done

Without this patch: 982481pps (64B)
With this patch: 1112495 pps (64B), about 13% improve
However, with your patch,  isn't this invalidated the mask cache entry which
point to the "M" you swap to the front? See my commands inline.
quoted
Signed-off-by: Tonghao Zhang <redacted>
Tested-by: Greg Rose <redacted>
---
 net/openvswitch/flow_table.c | 101 ++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 49 deletions(-)
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 8d4f50d..a10d421 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -538,7 +538,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,

                mask = rcu_dereference_ovsl(ma->masks[i]);
                if (!mask)
-                       continue;
+                       break;

                flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
                if (flow) { /* Found */
@@ -695,7 +695,7 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
 int ovs_flow_tbl_num_masks(const struct flow_table *table)
 {
        struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
-       return ma->count;
+       return READ_ONCE(ma->count);
 }

 static struct table_instance *table_instance_expand(struct table_instance *ti,
@@ -704,21 +704,33 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
        return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }

-static void tbl_mask_array_delete_mask(struct mask_array *ma,
-                                      struct sw_flow_mask *mask)
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+                                   struct sw_flow_mask *mask)
 {
-       int i;
+       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+       int i, ma_count = READ_ONCE(ma->count);

        /* Remove the deleted mask pointers from the array */
-       for (i = 0; i < ma->max; i++) {
-               if (mask == ovsl_dereference(ma->masks[i])) {
-                       RCU_INIT_POINTER(ma->masks[i], NULL);
-                       ma->count--;
-                       kfree_rcu(mask, rcu);
-                       return;
-               }
+       for (i = 0; i < ma_count; i++) {
+               if (mask == ovsl_dereference(ma->masks[i]))
+                       goto found;
        }
+
        BUG();
+       return;
+
+found:
+       WRITE_ONCE(ma->count, ma_count -1);
+
+       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
+       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
So when you swap the ma->masks[ma_count -1], the mask cache entry
who's 'mask_index == ma_count' become all invalid?
Yes, a little tricky.
Regards,
William

<snip>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help