Thread (4 messages) 4 messages, 3 authors, 2012-08-20

Re: [PATCH v2 06/11] memcg: kmem controller infrastructure

From: Greg Thelen <hidden>
Date: 2012-08-15 16:39:03
Also in: linux-mm, lkml

On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
quoted
On Mon, Aug 13 2012, Glauber Costa wrote:
quoted
quoted
quoted
quoted
+	WARN_ON(mem_cgroup_is_root(memcg));
+	size = (1 << order) << PAGE_SHIFT;
+	memcg_uncharge_kmem(memcg, size);
+	mem_cgroup_put(memcg);
Why do we need ref-counting here ? kmem res_counter cannot work as
reference ?
This is of course the pair of the mem_cgroup_get() you commented on
earlier. If we need one, we need the other. If we don't need one, we
don't need the other =)

The guarantee we're trying to give here is that the memcg structure will
stay around while there are dangling charges to kmem, that we decided
not to move (remember: moving it for the stack is simple, for the slab
is very complicated and ill-defined, and I believe it is better to treat
all kmem equally here)
By keeping memcg structures hanging around until the last referring kmem
page is uncharged do such zombie memcg each consume a css_id and thus
put pressure on the 64k css_id space?  I imagine in pathological cases
this would prevent creation of new cgroups until these zombies are
dereferenced.
Yes, but although this patch makes it more likely, it doesn't introduce
that. If the tasks, for instance, grab a reference to the cgroup dentry
in the filesystem (like their CWD, etc), they will also keep the cgroup
around.
Fair point.  But this doesn't seems like a feature.  It's probably not
needed initially, but what do you think about creating a
memcg_kernel_context structure which is allocated when memcg is
allocated?  Kernel pages charged to a memcg would have
page_cgroup->mem_cgroup=memcg_kernel_context rather than memcg.  This
would allow the mem_cgroup and its css_id to be deleted when the cgroup
is unlinked from cgroupfs while allowing for the active kernel pages to
continue pointing to a valid memcg_kernel_context.  This would be a
reference counted structure much like you are doing with memcg.  When a
memcg is deleted the memcg_kernel_context would be linked into its
surviving parent memcg.  This would avoid needing to visit each kernel
page.
quoted
Is there any way to see how much kmem such zombie memcg are consuming?
I think we could find these with
for_each_mem_cgroup_tree(root_mem_cgroup).
Yes, just need an interface for that. But I think it is something that
can be addressed orthogonaly to this work, in a separate patch, not as
some fundamental limitation.
Agreed.
quoted
Basically, I'm wanting to know where kernel memory has been
allocated.  For live memcg, an admin can cat
memory.kmem.usage_in_bytes.  But for zombie memcg, I'm not sure how
to get this info.  It looks like the root_mem_cgroup
memory.kmem.usage_in_bytes is not hierarchically charged.
Not sure what you mean by not being hierarchically charged. It should
be, when use_hierarchy = 1. As a matter of fact, I just tested it, and I
do see kmem being charged all the way to the root cgroup when hierarchy
is used. (we just can't limit it there)
You're correct, my mistake.

I think the procedure to determine out the amount of zombie kmem is:
  root_mem_cgroup.kmem_usage_in_bytes - 
    sum(all top level memcg memory.kmem_usage_in_bytes)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help