Thread: 39 messages, 5 authors, 2015-02-27

Re: [RFC PATCH 00/29] Phase 2 of fib_trie updates

From: Alexander Duyck <hidden>
Date: 2015-02-25 05:12:50

On 02/24/2015 07:53 PM, David Miller wrote:
From: Alexander Duyck <redacted>
Date: Tue, 24 Feb 2015 12:47:55 -0800
This patch series implements the second phase of the fib_trie changes.  I
presented on these and the previous changes at Netdev01 and netconf.  The
slides for the Netdev01 presentation can be found at
https://www.netdev01.org/docs/duyck-fib-trie.pdf.

I'm currently debating if I should just submit the entire patch-set as-is
or if I should hold off on submitting the last 10 patches as they currently
have a potential performance impact in the case of a large number of
entries placed in the local table.  Specifically I have seen that removing
an interface in the case of 8K local subnets being configured on it
resulted in the time for a dummy interface being removed increasing from
about .6 seconds to 2.4 seconds.  I am not sure how common of a use-case
something like this would be.  I have not seen the same issue if I assign
8K routes to the interface as I believe the fib_table_flush aggregates them
all into one resize action.

The entire series reduces the total look-up time by another 20-35% versus
what is currently in the 4.0-rc1 kernel.  So for example a set of routing
look-ups which took 140ns in the 4.0-rc1 kernel will now only take about
105ns after these patches.
I did a quick once-over for these changes and conceptually they look
fine.

Why are sequences of removals so much more costly now?  Is it because
of the maintenance of the information in the parent when rebalancing?

In any event, I'll say two things:

1) You should submit these changes in smaller batches anyways.
   It's easier to review and get small sets of transformations
   tested as a unit.
Yeah, these will probably be submitted as 3 sets.  The first being the
leaf_info removal, then the key_vector stuff, and finally reworking the
RCU and pushing everything up one level so the pointer and key info
occupy the same cache line.
2) For the device removal case, we can batch the inet addr removal
   based route delete operations, and thus mitigate the rebalancing
   costs.
The problem is that the tnodes are now split over 2 cache lines.  As a
result, in order to resize a node, or replace it with the leaf contained
in the node, you end up having to replace the parent of the node as well.
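
A minimal sketch of the difference, to make the cost concrete.  The
struct and function names here (ptr_parent, kv_parent, kv_slot and the
*_replace helpers) are hypothetical stand-ins and not the actual
fib_trie definitions, and the copy-the-parent step is only one way to
model "replace the parent"; real code would publish the new parent via
RCU and free the old one after a grace period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the kernel's fib_trie structures. */
struct node {
	unsigned int key;
};

/* Old-style parent: each slot is a bare pointer, the key lives in the child. */
struct ptr_parent {
	size_t nslots;
	struct node *child[];
};

/* New-style parent: each slot carries the child's key next to its pointer. */
struct kv_slot {
	unsigned int key;
	struct node *ptr;
};

struct kv_parent {
	size_t nslots;
	struct kv_slot slot[];
};

/* Allocate a zeroed new-style parent with nslots slots; NULL on failure. */
struct kv_parent *kv_parent_alloc(size_t nslots)
{
	struct kv_parent *p;

	p = calloc(1, sizeof(*p) + nslots * sizeof(p->slot[0]));
	if (p)
		p->nslots = nslots;
	return p;
}

/* Old layout: replacing a child is a single pointer store in the parent. */
void ptr_parent_replace(struct ptr_parent *p, size_t idx, struct node *n)
{
	assert(idx < p->nslots);
	p->child[idx] = n;
}

/*
 * New layout (as modeled here): key and pointer have to change together,
 * so build a fresh copy of the whole parent with the one slot updated.
 * The caller swaps in the copy and disposes of the old parent.
 * Returns NULL on allocation failure.
 */
struct kv_parent *kv_parent_replace(const struct kv_parent *old, size_t idx,
				    unsigned int key, struct node *n)
{
	size_t bytes = sizeof(*old) + old->nslots * sizeof(old->slot[0]);
	struct kv_parent *copy;

	assert(idx < old->nslots);
	copy = malloc(bytes);
	if (!copy)
		return NULL;
	memcpy(copy, old, bytes);
	copy->slot[idx].key = key;
	copy->slot[idx].ptr = n;
	return copy;
}
```

In this model the old layout touches one slot per replacement, while
the new layout touches every slot in the parent, which is why the size
of the parent matters so much.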

As it turns out, dropping a subnet from the local trie occurs in two
steps.  The first appears to drop the broadcast addresses and flush
them.  This causes significant overhead, since it means the kernel has
to reallocate the 8K child tnode each time a subnet/child collapses
from a 4 child tnode to just a leaf.  Then it looks like the kernel
goes through and deletes the local addresses that were there for each
subnet one at a time.  This was much cheaper in the old setup since it
was just a matter of swapping a pointer instead of having to update a
pointer and key information.
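
As a rough back-of-the-envelope model of the first step, the sketch
below counts slot writes only.  It is not a timing prediction, and the
function names are made up for illustration.  It assumes each of the
collapses either does one pointer store (old setup) or rewrites every
slot of the parent tnode (new setup).

```c
#include <stdint.h>

/* Old setup: each collapsing child costs one pointer store in the parent. */
uint64_t collapse_writes_pointer_swap(uint64_t nchildren)
{
	uint64_t writes = 0;
	uint64_t i;

	for (i = 0; i < nchildren; i++)
		writes += 1;
	return writes;
}

/*
 * New setup, as modeled here: each collapsing child forces the parent to
 * be reallocated, so every one of its parent_slots entries is rewritten.
 */
uint64_t collapse_writes_parent_copy(uint64_t nchildren, uint64_t parent_slots)
{
	uint64_t writes = 0;
	uint64_t i;

	for (i = 0; i < nchildren; i++)
		writes += parent_slots;
	return writes;
}
```

With 8192 subnets under an 8192-slot parent, the first function gives
8192 writes and the second gives 8192 * 8192 = 67108864, so the work
grows quadratically with the number of subnets in this model.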

- Alex