Thread (27 messages) 27 messages, 4 authors, 2025-08-15

Re: [PATCH v4 net-next 01/10] mptcp: Fix up subflow's memcg when CONFIG_SOCK_CGROUP_DATA=n.

From: Kuniyuki Iwashima <kuniyu@google.com>
Date: 2025-08-15 02:31:23
Also in: linux-mm, mptcp, netdev

On Thu, Aug 14, 2025 at 6:06 PM Shakeel Butt [off-list ref] wrote:
On Thu, Aug 14, 2025 at 05:05:56PM -0700, Kuniyuki Iwashima wrote:
quoted
On Thu, Aug 14, 2025 at 4:46 PM Shakeel Butt [off-list ref] wrote:
quoted
On Thu, Aug 14, 2025 at 04:27:31PM -0700, Kuniyuki Iwashima wrote:
quoted
On Thu, Aug 14, 2025 at 2:44 PM Shakeel Butt [off-list ref] wrote:
quoted
On Thu, Aug 14, 2025 at 08:08:33PM +0000, Kuniyuki Iwashima wrote:
quoted
When sk_alloc() allocates a socket, mem_cgroup_sk_alloc() sets
sk->sk_memcg based on the current task.

MPTCP subflow socket creation is triggered from userspace or
an in-kernel worker.

In the latter case, sk->sk_memcg is not what we want.  So, we fix
it up from the parent socket's sk->sk_memcg in mptcp_attach_cgroup().

Although the code is placed under #ifdef CONFIG_MEMCG, it is buried
under #ifdef CONFIG_SOCK_CGROUP_DATA.

The two configs are orthogonal.  If CONFIG_MEMCG is enabled without
CONFIG_SOCK_CGROUP_DATA, the subflow's memory usage is not charged
correctly.

Let's wrap sock_create_kern() for subflow with set_active_memcg()
using the parent sk->sk_memcg.

Fixes: 3764b0c5651e3 ("mptcp: attach subflow socket to parent cgroup")
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 mm/memcontrol.c     |  5 ++++-
 net/mptcp/subflow.c | 11 +++--------
 2 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8dd7fbed5a94..450862e7fd7a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5006,8 +5006,11 @@ void mem_cgroup_sk_alloc(struct sock *sk)
      if (!in_task())
              return;

+     memcg = current->active_memcg;
+
Use active_memcg() instead of current->active_memcg and do before the
!in_task() check.
Why not reuse the !in_task() check here ?
We never use int_active_memcg for socket and also
know int_active_memcg is always NULL here.
If we are making mem_cgroup_sk_alloc() work with set_active_memcg()
infra then make it work for both in_task() and !in_task() contexts.
Considering e876ecc67db80, then I think we should add
set_active_memcg_in_task() and active_memcg_in_task().

or at least we need WARN_ON() if we want to place active_memcg()
before the in_task() check, but this looks ugly.

        memcg = active_memcg();
        if (!in_task() && !memcg)
                return;
        DEBUG_NET_WARN_ON_ONCE(!in_task() && memcg))
You don't have to use the code as is. It is just an example. Basically I
am asking if in future someone does the following:

        // in !in_task() context
        old_memcg = set_active_memcg(new_memcg);
        sk = sk_alloc();
        set_active_memcg(old_memcg);

mem_cgroup_sk_alloc() should work and associate the sk with the
new_memcg.

You can manually inline active_memcg() function to avoid multiple
in_task() checks like below:
Will do so, thanks!

void mem_cgroup_sk_alloc(struct sock *sk)
{
        struct mem_cgroup *memcg;

        if (!mem_cgroup_sockets_enabled)
                return;

        if (!in_task()) {
                memcg = this_cpu_read(int_active_memcg);

                /*
                 * Do not associate the sock with unrelated interrupted
                 * task's memcg.
                 */
                if (!memcg)
                        return;
        } else {
                memcg = current->active_memcg;
        }

        rcu_read_lock();
        if (likely(!memcg))
                memcg = mem_cgroup_from_task(current);
        if (mem_cgroup_is_root(memcg))
                goto out;
        if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !memcg1_tcpmem_active(memcg))
                goto out;
        if (css_tryget(&memcg->css))
                sk->sk_memcg = memcg;
out:
        rcu_read_unlock();
}
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help