Re: [PATCH v3 2/7] socket: initial cgroup code.
From: Glauber Costa <hidden>
Date: 2011-09-21 19:00:42
Also in:
linux-mm, lkml
On 09/21/2011 03:47 PM, Greg Thelen wrote:
On Sun, Sep 18, 2011 at 5:56 PM, Glauber Costa wrote:
We aim to control the amount of kernel memory pinned at any time by tcp sockets. To lay the foundations for this work, this patch adds a pointer to the kmem_cgroup to the socket structure.

Signed-off-by: Glauber Costa <redacted>
CC: David S. Miller <davem@davemloft.net>
CC: Hiroyouki Kamezawa <redacted>
CC: Eric W. Biederman <redacted>
...
+void sock_update_memcg(struct sock *sk)
+{
+	/* right now a socket spends its whole life in the same cgroup */
+	BUG_ON(sk->sk_cgrp);
+
+	rcu_read_lock();
+	sk->sk_cgrp = mem_cgroup_from_task(current);
+
+	/*
+	 * We don't need to protect against anything task-related, because
+	 * we are basically stuck with the sock pointer that won't change,
+	 * even if the task that originated the socket changes cgroups.
+	 *
+	 * What we do have to guarantee, is that the chain leading us to
+	 * the top level won't change under our noses. Incrementing the
+	 * reference count via cgroup_exclude_rmdir guarantees that.
+	 */
+	cgroup_exclude_rmdir(mem_cgroup_css(sk->sk_cgrp));

This grabs a css_get() reference, which prevents rmdir (will return -EBUSY).
Yes.

How long is this reference held?

For the socket lifetime.
I wonder about the case where a process creates a socket in memcg M1 and later is moved into memcg M2. At that point an admin would expect to be able to 'rmdir M1'. I think this rmdir would return -EBUSY and I suspect it would be difficult for the admin to understand why the rmdir of M1 failed. It seems that to rmdir a memcg, an admin would have to kill all processes that allocated sockets while in M1. Such processes may not still be in M1.
+	rcu_read_unlock();
+}
I agree. But I also don't see many ways around it without implementing full task migration. Right now I am working under the assumption that tasks are long-lived inside the cgroup. Migration potentially introduces some nasty locking problems in the mem_schedule path.

Also, unless I am missing something, memcg already has a policy of not carrying charges around, probably because of this very same complexity. True, at least that won't give you EBUSY... But I think this is at least a way to guarantee that the cgroup won't disappear from under us in the middle of our allocations.
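
For readers following the M1/M2 scenario discussed above, here is a minimal userspace sketch of the pinning behaviour. It is not kernel code: struct group, struct task, sock_attach(), sock_release(), task_move() and group_rmdir() are all hypothetical stand-ins invented for illustration, and the plain counter only models the idea of a reference that blocks removal, not the actual css/cgroup machinery.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct group {
    const char *name;
    int pins;     /* references held by sockets created in this group */
    int removed;  /* set once group_rmdir() succeeds */
};

struct task {
    struct group *grp;  /* group the task currently belongs to */
};

struct sock {
    struct group *grp;  /* group bound to this socket at creation */
};

/* Models the idea behind sock_update_memcg(): bind once, pin the group. */
static void sock_attach(struct sock *sk, const struct task *creator)
{
    assert(sk->grp == NULL);  /* a socket stays in one group for life */
    sk->grp = creator->grp;
    sk->grp->pins++;
}

/* Socket teardown drops the pin taken at creation. */
static void sock_release(struct sock *sk)
{
    sk->grp->pins--;
    sk->grp = NULL;
}

/* Moving a task does not move the sockets it already created. */
static void task_move(struct task *t, struct group *to)
{
    t->grp = to;
}

/* Removal fails with -EBUSY while any socket still pins the group. */
static int group_rmdir(struct group *g)
{
    if (g->pins > 0)
        return -EBUSY;
    g->removed = 1;
    return 0;
}
```

In this model, a task that calls sock_attach() while in M1 and is then moved to M2 with task_move() leaves M1 pinned: group_rmdir() on M1 returns -EBUSY until sock_release() runs for that socket, even though no task remains in M1.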