Re: [RFC 4/4] net/ipv4/fib: Don't synchronise_rcu() every 512Kb
From: Dmitry Safonov <hidden>
Date: 2019-03-26 17:15:51
Also in: lkml
Hi David,

On 3/26/19 3:39 PM, David Ahern wrote:
> On 3/26/19 9:30 AM, Dmitry Safonov wrote:
>> Fib trie has a hard-coded sync_pages limit to call synchronise_rcu().
>> The limit is 128 pages or 512Kb (considering common case with 4Kb
>> pages).
>>
>> Unfortunately, at Arista we have use-scenarios with full view software
>> forwarding. At the scale of 100K and more routes even on 2 core boxes
>> the hard-coded limit starts actively shooting in the leg: lockup
>> detector notices that rtnl_lock is held for seconds.
>> First reason is previously broken MAX_WORK, that didn't limit pending
>> balancing work. While fixing it, I've noticed that the bottle-neck is
>> actually in the number of synchronise_rcu() calls.
>>
>> I've tried to fix it with a patch to decrement number of tnodes in rcu
>> callback, but it hasn't much affected performance.
>>
>> One possible way to "fix" it - provide another sysctl to control
>> sync_pages, but in my POV it's nasty - exposing another realisation
>> detail into user-space.
>
> well, that was accepted last week. ;-)
>
> commit 9ab948a91b2c2abc8e82845c0e61f4b1683e3a4f
> Author: David Ahern [off-list ref]
> Date:   Wed Mar 20 09:18:59 2019 -0700
>
>     ipv4: Allow amount of dirty memory from fib resizing to be controllable
>
> Can you see how that change (should backport easily) affects your test
> case? From my perspective 16MB was the sweet spot.

Heh, I based my series on master, so I haven't seen it yet.
I still wonder whether it's good to expose it to userspace rather than
use a shrinker, but this should probably work for me - I'll test it in
the next few days.

Thanks,
Dmitry