Thread: 66 messages, 8 authors, 2023-07-28

RE: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper size

From: Zhang, Cathy <hidden>
Date: 2023-05-09 10:40:03

-----Original Message-----
From: Paolo Abeni <pabeni@redhat.com>
Sent: Tuesday, May 9, 2023 5:51 PM
To: Zhang, Cathy <redacted>; edumazet@google.com;
davem@davemloft.net; kuba@kernel.org
Cc: Brandeburg, Jesse <redacted>; Srinivas, Suresh
[off-list ref]; Chen, Tim C [off-list ref]; You,
Lizhen [off-list ref]; eric.dumazet@gmail.com;
netdev@vger.kernel.org
Subject: Re: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper
size

On Sun, 2023-05-07 at 19:08 -0700, Cathy Zhang wrote:
Before commit 4890b686f408 ("net: keep sk->sk_forward_alloc as small
as possible"), each TCP can forward allocate up to 2 MB of memory and
tcp_memory_allocated might hit tcp memory limitation quite soon. To
reduce the memory pressure, that commit keeps sk->sk_forward_alloc as
small as possible, which will be less than 1 page size if
SO_RESERVE_MEM is not specified.

However, with commit 4890b686f408 ("net: keep sk->sk_forward_alloc as
small as possible"), memcg charge hot paths are observed while system
is stressed with a large amount of connections. That is because
sk->sk_forward_alloc is too small and it's always less than
sk->truesize, network handlers like tcp_rcv_established() should jump to
slow path more frequently to increase sk->sk_forward_alloc. Each
memory allocation will trigger memcg charge, then perf top shows the
following contention paths on the busy system.

    16.77%  [kernel]            [k] page_counter_try_charge
    16.56%  [kernel]            [k] page_counter_cancel
    15.65%  [kernel]            [k] try_charge_memcg
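
To make the reasoning above concrete, here is a rough toy model (not kernel code) of how the amount of budget kept in forward_alloc affects how often the slow path is taken. The function name, the 4096-byte page size, the truesize of 4608 bytes and the 64 KB threshold are all assumptions picked for illustration; only the idea of a reclaim threshold comes from this thread.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def count_charges(truesizes, reclaim_threshold):
    """Toy model: count how often the slow path (a charge) is taken.

    forward_alloc is the per-socket pre-charged budget in bytes.
    After each buffer is consumed, whole pages above reclaim_threshold
    are handed back, which mimics keeping forward_alloc small.
    """
    forward_alloc = 0
    charges = 0
    for size in truesizes:
        if forward_alloc < size:
            # Slow path: charge whole pages to cover the shortfall.
            pages = -(-(size - forward_alloc) // PAGE_SIZE)  # ceiling division
            forward_alloc += pages * PAGE_SIZE
            charges += 1
        forward_alloc -= size  # buffer queued: budget consumed
        forward_alloc += size  # buffer freed by the reader: budget returned
        # Reclaim whole pages above the threshold.
        excess = forward_alloc - reclaim_threshold
        if excess >= PAGE_SIZE:
            forward_alloc -= (excess // PAGE_SIZE) * PAGE_SIZE
    return charges


if __name__ == "__main__":
    stream = [4608] * 1000  # hypothetical truesize, slightly above one page
    print("threshold 0:    ", count_charges(stream, 0))
    print("threshold 64 KB:", count_charges(stream, 64 * 1024))
```

In this model, a threshold of 0 leaves less than one page behind after every reclaim, so each buffer larger than the leftover takes the slow path again, while a larger threshold lets later buffers reuse the budget that is already charged.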

I'm guessing you hit memcg limits frequently. I'm wondering if it's just a
matter of tuning/reducing tcp limits in /proc/sys/net/ipv4/tcp_mem.

Hi Paolo,

Do you mean hitting the limit set by "--memory" when the container is started?
If the memory option is not specified when a container is initialized, cgroup2
will create a memcg without a memory limit on the system, right? We've run the
test without this setting, and the memcg charge hot paths still exist.

It seems that /proc/sys/net/ipv4/tcp_[wr]mem cannot be changed by a simple
echo write, but requires a change to /etc/sysctl.conf. I'm not sure if it
could be changed without stopping the running application. Additionally, will
this type of change have a deeper and more complex impact on the network stack,
compared to reclaim_threshold, which is assumed to mostly affect the memory
allocation paths? Considering this, we decided to add the reclaim_threshold
directly.
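
For reference, a small sketch of how the current tcp_mem limits mentioned above could be inspected before deciding on any tuning. It assumes the usual format of /proc/sys/net/ipv4/tcp_mem (three values, counted in pages); the helper names and the 4096-byte page size are made up for this example.

```python
from typing import NamedTuple


class TcpMem(NamedTuple):
    low: int       # pages; below this TCP does not moderate its memory use
    pressure: int  # pages; above this TCP enters memory-pressure mode
    high: int      # pages; upper bound for all TCP sockets together


def parse_tcp_mem(text: str) -> TcpMem:
    """Parse the three whitespace-separated values of tcp_mem."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 values, got {len(fields)}")
    return TcpMem(*(int(f) for f in fields))


def read_tcp_mem(path: str = "/proc/sys/net/ipv4/tcp_mem") -> TcpMem:
    """Read and parse tcp_mem from procfs (Linux only)."""
    with open(path) as f:
        return parse_tcp_mem(f.read())


def pages_to_mib(pages: int, page_size: int = 4096) -> float:
    """Convert a page count to MiB, assuming the given page size."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    limits = read_tcp_mem()
    for name, pages in limits._asdict().items():
        print(f"{name:9s} {pages:10d} pages  ~{pages_to_mib(pages):.1f} MiB")
```

Note that tcp_mem is system-wide and counted in pages, while tcp_rmem and tcp_wmem are per-socket buffer sizes in bytes, so they are different knobs.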
Cheers,

Paolo
  