Re: [PATCH net-next] net: neigh: avoid calling neigh_forced_gc on every alloc when table is full
From: Vimal Agrawal <hidden>
Date: 2026-06-29 07:57:15
Hi Kuniyuki,
Thank you for the feedback.
However, the rate limiting issue exists independently of the threshold
values. If entries genuinely exceed gc_thresh3 — regardless of what it
is set to — neigh_forced_gc() is called on every allocation attempt
with no rate limiting. In my workload, most entries are
active/reachable with refcnt > 1, so the GC walk traverses the entire
table without reclaiming anything. Increasing gc_thresh3 would make
this worse, not better, as GC now has a larger table to scan on each
call.
Regarding neigh_hash_shift: in my workload, neigh_alloc() returns
ENOBUFS before reaching do_alloc() since GC cannot reclaim any
entries. kzalloc() is never called, so neigh_hash_grow() is not
involved in the latency I observed. The pre-lock time check in
neigh_forced_gc() is a low-cost safeguard that prevents repeated full
table scans regardless of gc_thresh3 value. It does not interfere with
correct GC behaviour — if entries are still above the threshold, GC
runs normally.
Hi Jakub,
I tested with different threshold values, filling the table completely
with 32k reachable entries and attempting 1000 additional allocations.
Exported neigh_forced_gc so that it can be profiled
no change 10ms 50ms 100ms
max cpu usage % 44% 11.8% 2.56% 1.42%
calls > 100us (of 1000) 101 31 13 7
At 10ms, max CPU usage is still 11.8% and 31 out of 1000 calls take
more than 100us. Given that 50ms reduces this to 2.56% and 13 calls
respectively, I would prefer 50ms as the threshold. However, I am open
to further discussion on the right value.
Thanks,
Vimal
On Fri, Jun 26, 2026 at 3:17 AM Kuniyuki Iwashima [off-list ref] wrote:quoted hunk ↗ jump to hunk
From: Vimal Agrawal <redacted> Date: Thu, 25 Jun 2026 10:20:20 +0000quoted
Once the neighbour table exceeds gc_thresh3, neigh_forced_gc() is called on every allocation attempt with no rate limiting. In workloads with mostly active/reachable entries, the GC walk traverses a large portion of the neighbour table without reclaiming entries, holding tbl->lock for an extended period. This causes severe lock contention and allocation latencies exceeding 16ms under sustained neighbour creation. Add a pre-lock check in neigh_forced_gc() to skip the GC run if one was performed within the last second, avoiding repeated full table scans and lock acquisitions on the hot allocation path. Profiling of neigh_create() shows ~3 orders of magnitude latency improvement with this change. Link:https://lore.kernel.org/netdev/CALkUMdSCpx_ywYCx_ePLdm6yioO1nQWx7sSM=AEgsq0kywHxTw@mail.gmail.com/ (local)From the thread, these look misconfigured. ---8<--- net.ipv6.neigh.default.gc_thresh2 = 32768 net.ipv6.neigh.default.gc_thresh3 = 32768 ---8<--- If gc_thresh3 is larger enough, gc_thresh2 will give you 5s rate limiting. If the number of active neigh entries constantly exceeds gc_thresh3, it will be the correct gc_thresh2 for you. Also, I guess you want a new kernel param for the first neigh_hash_alloc(), which is currently fixed for 3, which is too small for some hosts. 50000 entries require neigh_hash_grow() 13 times. Can you test this on your real workload, starting from neigh_hash_shift=16 and appropriate gc_thresh2/3 ? ---8<---diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 1349c0eedb64..a75b3750eec9 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c@@ -1817,6 +1817,22 @@ EXPORT_SYMBOL(neigh_parms_release); static struct lock_class_key neigh_table_proxy_queue_class; static struct neigh_table __rcu *neigh_tables[NEIGH_NR_TABLES] __read_mostly; +static __initdata unsigned long neigh_hash_shift = 3; + +static int __init neigh_set_hash_shift(char *str) +{ + ssize_t ret; + + if (!str) + return 0; + + ret = kstrtoul(str, 0, &neigh_hash_shift); + if (ret) + return 0; + + return 1; +} +__setup("neigh_hash_shift=", neigh_set_hash_shift); void neigh_table_init(int index, struct neigh_table *tbl) {@@ -1843,7 +1859,7 @@ void neigh_table_init(int index, struct neigh_table *tbl) panic("cannot create neighbour proc dir entry"); #endif - RCU_INIT_POINTER(tbl->nht, neigh_hash_alloc(3)); + RCU_INIT_POINTER(tbl->nht, neigh_hash_alloc(neigh_hash_shift)); phsize = (PNEIGH_HASHMASK + 1) * sizeof(struct pneigh_entry *); tbl->phash_buckets = kzalloc(phsize, GFP_KERNEL); ---8<---quoted
Signed-off-by: Vimal Agrawal <redacted> --- net/core/neighbour.c | 3 +++ 1 file changed, 3 insertions(+)diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 1349c0eedb64..078842db3c5f 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c@@ -260,6 +260,9 @@ static int neigh_forced_gc(struct neigh_table *tbl) int shrunk = 0; int loop = 0; + if (!time_after(jiffies, READ_ONCE(tbl->last_flush) + HZ)) + return 0; + NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs); spin_lock_bh(&tbl->lock); --2.17.1 v