Re: [PATCH v3 04/13] kmem accounting basic infrastructure

From: Glauber Costa <hidden>
Date: 2012-09-26 22:49:27
Also in: linux-mm, lkml

On 09/27/2012 02:11 AM, Johannes Weiner wrote:

On Thu, Sep 27, 2012 at 12:02:14AM +0400, Glauber Costa wrote:

On 09/26/2012 11:56 PM, Tejun Heo wrote:

Hello,

On Wed, Sep 26, 2012 at 11:46:37PM +0400, Glauber Costa wrote:

Besides not being part of cgroup core, and respecting very much both
cgroups' and basic sanity properties, kmem is an actual feature that
some people want, and some people don't. There is no reason to believe
that applications that want it will live in the same environment with
ones that don't want it.

I don't know.  It definitely is less crazy than .use_hierarchy but I
wouldn't say it's an inherently different thing.  I mean, what does it
even mean to have u+k limit on one subtree and not on another branch?
And we worry about things like what if parent doesn't enable it but
its children do.

It is inherently different. To begin with, it actually contemplates two
use cases. It is not a work around.

The meaning is also very well defined. The meaning of having this
enabled in one subtree and not in other is: Subtree A wants to track
kernel memory. Subtree B does not. It's that, and never more than that.
There are no maybes and no buts, no magic knobs that make it behave in
a crazy way.

If a child enables it but the parent does not, this does what every
tree does: enable it from that point downwards.
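
The "from that point downwards" rule can be sketched as a small model.
This is only an illustration: `struct group`, `kmem_limit_set` and
`kmem_accounted()` are hypothetical names, not the kernel's
`struct mem_cgroup` or anything from the patch. It assumes a group
counts as kmem-accounted when it, or any of its ancestors, has had a
kmem limit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cgroup node; not the kernel's struct mem_cgroup. */
struct group {
    const struct group *parent;  /* NULL for the root */
    bool kmem_limit_set;         /* a kmem limit was written for this group */
};

/*
 * Assumed rule: a group is kmem-accounted if it, or any ancestor, has a
 * kmem limit set. Setting the limit on one node therefore enables
 * tracking for that whole subtree and for nothing outside it.
 */
static bool kmem_accounted(const struct group *g)
{
    assert(g != NULL);
    for (; g != NULL; g = g->parent)
        if (g->kmem_limit_set)
            return true;
    return false;
}
```

With a root that has no limit, a child A with a limit and a sibling B
without one, A and everything below A are tracked, while the root and B
are not.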

This is a feature which adds complexity.  If the feature is necessary
and justified, sure.  If not, let's please not and let's err on the
side of conservativeness.  We can always add it later but the other
direction is much harder.

I disagree. Having kmem tracking adds complexity. Having to cope with
the use case where we turn it on dynamically to cope with the "user page
only" use case adds complexity. But I see no significant complexity
being added by having it per subtree. Really.

Maybe not in code, but you are adding an extra variable into the
system.  "One switch per subtree" is more complex than "one switch."
Yes, the toggle is hidden behind setting the limit, but it's still a
toggle.  The use_hierarchy complexity comes not from the file that
enables it, but from the resulting semantics.

I didn't claim the complexity was in the code. I actually think the
opposite of what you do, and claim that a global switch is more
complex than a per-subtree one. All properties we have so far apply to
subtrees, due to cgroup's hierarchical nature. We have no global
switches like this so far, and adding one would just add a new concept
that wasn't here.

kmem accounting is expensive and we definitely want to allow enabling
it separately from traditional user memory accounting.  But I think
there is no good reason to not demand an all-or-nothing answer from
the admin; either he wants kmem tracking on a machine or not.  At
least you haven't presented a convincing case, IMO.

I don't think there is strong/any demand for per-node toggles, but
once we add this behavior, people will rely on it and expect kmem
tracking to stay local and we are stuck with it.  Adding it for the
reason that people will use it is a self-fulfilling prophecy.


I don't think this is a compatibility-only switch. Much has been said in
the past about the problem of sharing. A lot of the kernel objects are
shared by nature; this is pretty much unavoidable. The answer we have
been giving to this inquiry is that the workloads (us) interested
in kmem accounting tend to be quite local in their file accesses (and
other kernel objects as well).

It should be obvious that not all workloads are like this, and some of
them would actually prefer to have their umem limited only.

I really don't think, and correct me if I am wrong, that the problem
lies in "is there a use case for umem?", but rather in whether they
should be allowed to coexist in a box.

And honestly, it seems to me totally reasonable to avoid restricting
people from running as many workloads as they think they can in the
same box.

You have the use_hierarchy fiasco in mind, and I do understand that you
are raising the flag and all that.

But think in terms of functionality: This thing here is a lot more
similar to swap than use_hierarchy. Would you argue that memsw should be
per-root?

We actually do have a per-root flag that controls accounting for swap.

The reason why it shouldn't: Some people want to limit memory
consumption all the way to the swap, some people don't. Same with kmem.

That lies in the nature of the interface: we chose k & u+k rather than
u & u+k, so our memory.limit_in_bytes will necessarily include kmem,
while swap is not included there.  But I really doubt that there is a
strong case for turning on swap accounting intentionally and then
limiting memory+swap only on certain subtrees.  Where would be the
sense in that?

It makes absolute sense. Because until I go set
memory.memsw.limit_in_bytes, my subtree is not limited, which is
precisely what kmem does.

And the use cases for that are:

1) I, application A, want to use 2G of mem, and I can never swap
2) I, application B, want to use 2G of mem, but I am fine using an extra
1G in swap.

There are plenty of workloads in both the "can swap" and "can't swap"
categories around.

