Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup... | netdev

[net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 01/17] fib_trie: Update usage stats to be percpu instead of global variables · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 02/17] fib_trie: Make leaf and tnode more uniform · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 03/17] fib_trie: Merge tnode_free and leaf_free into node_free · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 04/17] fib_trie: Merge leaf into tnode · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 05/17] fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 06/17] fib_trie: Optimize fib_find_node · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 07/17] fib_trie: Optimize fib_table_insert · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 08/17] fib_trie: Update meaning of pos to represent unchecked bits · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 09/17] fib_trie: Use unsigned long for anything dealing with a shift by bits · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 10/17] fib_trie: Push rcu_read_lock/unlock to callers · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 11/17] fib_trie: Move resize to after inflate/halve · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 12/17] fib_trie: Add functions should_inflate and should_halve · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 13/17] fib_trie: Push assignment of child to parent down into inflate/halve · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 14/17] fib_trie: Push tnode flushing down to inflate/halve · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 15/17] fib_trie: inflate/halve nodes in a more RCU friendly way · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 16/17] fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child · Alexander Duyck <hidden> · 2014-12-31
[net-next PATCH 17/17] fib_trie: Add tracking value for suffix length · Alexander Duyck <hidden> · 2014-12-31
Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · David Miller <davem@davemloft.net> · 2014-12-31
Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · Alexander Duyck <hidden> · 2015-01-01
Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · David Miller <davem@davemloft.net> · 2015-01-02
Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · Alexander Duyck <hidden> · 2015-01-02
Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% · David Miller <davem@davemloft.net> · 2015-01-02

Re: [net-next PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%

From: Alexander Duyck <hidden>
Date: 2015-01-01 02:32:54

On 12/31/2014 03:46 PM, David Miller wrote:

> From: Alexander Duyck <redacted>
> Date: Wed, 31 Dec 2014 10:55:23 -0800
>
>> These patches are meant to address several performance issues I have seen
>> in the fib_trie implementation, and fib_table_lookup specifically.  With
>> these changes in place I have seen a reduction of up to 35 to 75% for the
>> total time spent in fib_table_lookup depending on the type of search being
>> performed.
>
> ...
>
>> Changes since RFC:
>>   Replaced this_cpu_ptr with correct call to this_cpu_inc in patch 1
>>   Changed test for leaf_info mismatch to (key ^ n->key) & li->mask_plen in patch 10
>
> As before, this looks awesome.

Thanks.

> All applied to net-next, thanks!
>
> This knocks about 35 cpu cycles off of a lookup that ends up using the
> default route on sparc64.  From about ~438 cycles to ~403.

Did that 438 value include both fib_table_lookup and check_leaf?  Just
curious as the overall gain seems smaller than what I have been seeing
on the x86 system I was testing with, but then again it could just be a
sparc64 thing.

I've started work on a second round of patches.  With any luck they
should be ready by the time the next net-next opens.  My hope is to cut
the look-up time by another 30 to 50%, though it will take some time.
I have to go through and drop the leaf_info structure, and I want to
look at splitting the tnode in half to break the key/pos/bits and child
pointer dependency chain, which will hopefully allow for a significant
reduction in memory read stalls.

I am also planning to take a look at addressing the memory waste that
occurs on nodes larger than 256 bytes, because kmalloc rounds
allocations of that size up to the next power of 2.  I'm thinking I
might try encouraging the growth of smaller nodes, and discouraging
anything over 256 bytes, by implementing a "truesize" type logic that
can be used in the inflate/halve functions so that the memory usage is
more accurately reflected.

- Alex
