Re: [PATCH v4 net-next 01/10] mptcp: Fix up subflow's memcg when CONFIG_SOCK_CGROUP_DATA=n.
From: Kuniyuki Iwashima <kuniyu@google.com>
Date: 2025-08-15 02:31:23
Also in:
linux-mm, mptcp, netdev
On Thu, Aug 14, 2025 at 6:06 PM Shakeel Butt [off-list ref] wrote:
On Thu, Aug 14, 2025 at 05:05:56PM -0700, Kuniyuki Iwashima wrote:quoted
On Thu, Aug 14, 2025 at 4:46 PM Shakeel Butt [off-list ref] wrote:quoted
On Thu, Aug 14, 2025 at 04:27:31PM -0700, Kuniyuki Iwashima wrote:quoted
On Thu, Aug 14, 2025 at 2:44 PM Shakeel Butt [off-list ref] wrote:quoted
On Thu, Aug 14, 2025 at 08:08:33PM +0000, Kuniyuki Iwashima wrote:quoted
When sk_alloc() allocates a socket, mem_cgroup_sk_alloc() sets sk->sk_memcg based on the current task. MPTCP subflow socket creation is triggered from userspace or an in-kernel worker. In the latter case, sk->sk_memcg is not what we want. So, we fix it up from the parent socket's sk->sk_memcg in mptcp_attach_cgroup(). Although the code is placed under #ifdef CONFIG_MEMCG, it is buried under #ifdef CONFIG_SOCK_CGROUP_DATA. The two configs are orthogonal. If CONFIG_MEMCG is enabled without CONFIG_SOCK_CGROUP_DATA, the subflow's memory usage is not charged correctly. Let's wrap sock_create_kern() for subflow with set_active_memcg() using the parent sk->sk_memcg. Fixes: 3764b0c5651e3 ("mptcp: attach subflow socket to parent cgroup") Suggested-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> --- mm/memcontrol.c | 5 ++++- net/mptcp/subflow.c | 11 +++-------- 2 files changed, 7 insertions(+), 9 deletions(-)diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 8dd7fbed5a94..450862e7fd7a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c@@ -5006,8 +5006,11 @@ void mem_cgroup_sk_alloc(struct sock *sk) if (!in_task()) return; + memcg = current->active_memcg; +Use active_memcg() instead of current->active_memcg and do before the !in_task() check.Why not reuse the !in_task() check here ? We never use int_active_memcg for socket and also know int_active_memcg is always NULL here.If we are making mem_cgroup_sk_alloc() work with set_active_memcg() infra then make it work for both in_task() and !in_task() contexts.Considering e876ecc67db80, then I think we should add set_active_memcg_in_task() and active_memcg_in_task(). or at least we need WARN_ON() if we want to place active_memcg() before the in_task() check, but this looks ugly. memcg = active_memcg(); if (!in_task() && !memcg) return; DEBUG_NET_WARN_ON_ONCE(!in_task() && memcg))You don't have to use the code as is. It is just an example. Basically I am asking if in future someone does the following: // in !in_task() context old_memcg = set_active_memcg(new_memcg); sk = sk_alloc(); set_active_memcg(old_memcg); mem_cgroup_sk_alloc() should work and associate the sk with the new_memcg. You can manually inline active_memcg() function to avoid multiple in_task() checks like below:
Will do so, thanks!
void mem_cgroup_sk_alloc(struct sock *sk)
{
struct mem_cgroup *memcg;
if (!mem_cgroup_sockets_enabled)
return;
if (!in_task()) {
memcg = this_cpu_read(int_active_memcg);
/*
* Do not associate the sock with unrelated interrupted
* task's memcg.
*/
if (!memcg)
return;
} else {
memcg = current->active_memcg;
}
rcu_read_lock();
if (likely(!memcg))
memcg = mem_cgroup_from_task(current);
if (mem_cgroup_is_root(memcg))
goto out;
if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !memcg1_tcpmem_active(memcg))
goto out;
if (css_tryget(&memcg->css))
sk->sk_memcg = memcg;
out:
rcu_read_unlock();
}