Thread: 51 messages, 13 authors, 2012-11-26

Re: [RFC v3 0/3] vmpressure_fd: Linux VM pressure notifications

From: Glauber Costa <hidden>
Date: 2012-11-16 21:13:05
Also in: linux-mm, lkml

Hey,


On 11/17/2012 12:04 AM, David Rientjes wrote:
> On Fri, 16 Nov 2012, Glauber Costa wrote:
>> My personal take:
>>
>> Most people hate memcg due to the cost it imposes. I've already
>> demonstrated that with some effort, it doesn't necessarily have to be
>> so. (http://lwn.net/Articles/517634/)
>>
>> The one thing I missed in that work was precisely notifications. If you
>> can come up with a good notifications scheme that *lives* in memcg, but
>> does not *depend* on the memcg infrastructure, I personally think it
>> could be a big win.
>
> This doesn't allow users of cpusets without memcg to have an API for
> memory pressure, that's why I thought it should be a new cgroup that can
> be mounted alongside any existing cgroup, any cgroup in the future, or
> just by itself.

>> Doing this in memcg has the advantage that the "per-group" vs "global"
>> is automatically solved, since the root memcg is just another name for
>> "global".
>
> That's true of any cgroup.

Yes. But memcg happens to also deal with memory usage, and already has
a notification mechanism =)
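
For context, the existing memcg notification mechanism referred to here is the cgroup v1 eventfd interface: an eventfd, an open descriptor for memory.usage_in_bytes, and a threshold in bytes are written as one line to cgroup.event_control. Below is a minimal sketch of that registration, assuming a cgroup v1 memory hierarchy and Python 3.10+ for os.eventfd; the helper names and the cgroup path are hypothetical, not taken from this thread.

```python
import os


def event_control_line(event_fd: int, usage_fd: int, threshold_bytes: int) -> str:
    """Build the registration line '<event_fd> <usage_fd> <threshold>'."""
    return f"{event_fd} {usage_fd} {threshold_bytes}"


def wait_for_threshold(cgroup_dir: str, threshold_bytes: int) -> int:
    """Block until usage crosses threshold_bytes; return the eventfd counter.

    cgroup_dir is a hypothetical path to a cgroup v1 memory group,
    e.g. /sys/fs/cgroup/memory/mygroup.
    """
    efd = os.eventfd(0)  # Linux only, Python 3.10+
    ufd = os.open(os.path.join(cgroup_dir, "memory.usage_in_bytes"), os.O_RDONLY)
    try:
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(event_control_line(efd, ufd, threshold_bytes))
        # The read blocks until the kernel signals the eventfd.
        return os.eventfd_read(efd)
    finally:
        os.close(ufd)
        os.close(efd)
```

Calling wait_for_threshold needs a mounted v1 memory controller and enough privileges to write to the group, so treat it as an illustration of the call sequence rather than something to run as-is.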

>> I honestly like your low/high/oom scheme better than memcg's
>> "threshold-in-bytes". I would also point out that those thresholds are
>> *far* from exact, due to the stock charging mechanism, and can be wrong
>> by as much as O(#cpus). So far, nobody complained. So in theory it
>> should be possible to convert memcg to low/high/oom, while still
>> accepting writes in bytes, that would be thrown in the closest bucket.
>
> I'm wondering if we should have more than three different levels.

In the case I outlined below, for backwards compatibility. What I
actually mean is that memcg *currently* allows arbitrary notifications.
One way to merge those, while moving to a saner 3-point notification, is
to still allow the old writes and fit them in the closest bucket.
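
A rough sketch of what "fit them in the closest bucket" could look like. The thread does not say how a byte threshold would map to a level, so this assumes, purely for illustration, that each level is anchored at a fraction of the group's limit; LEVEL_ANCHORS, its values, and closest_level are all hypothetical.

```python
# Hypothetical anchors: fraction of the group's limit at which each
# level would sit. These numbers are made up for illustration only.
LEVEL_ANCHORS = {"low": 0.60, "high": 0.85, "oom": 1.00}


def closest_level(threshold_bytes: int, limit_bytes: int) -> str:
    """Map a legacy threshold written in bytes to the nearest level."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    fraction = threshold_bytes / limit_bytes
    # Pick the level whose anchor is nearest to the requested fraction.
    return min(LEVEL_ANCHORS, key=lambda lvl: abs(LEVEL_ANCHORS[lvl] - fraction))
```

With these made-up anchors and a 1000-byte limit, a legacy write of 800 lands in "high" and a write of 100 lands in "low". Since the byte thresholds are already inexact, rounding to a bucket would arguably lose little precision in practice.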

>> Another thing from one of your e-mails, that may shift you in the memcg
>> direction:
>>
>> "2. The last time I checked, cgroups memory controller did not (and I
>> guess still does not) account kernel-owned slabs. I asked several
>> times why so, but nobody answered."
>>
>> It should, now, in the latest -mm, although it won't do per-group
>> reclaim (yet).
>
> Not sure where that was written, but I certainly didn't write it

Indeed you didn't, Anton did. It's his proposal, so I actually meant him
every time I said "you". The fact that you were the last responder made
it confusing - sorry.

> and it's not really relevant in this discussion: memory pressure
> notifications would be triggered by reclaim when trying to allocate
> memory; why we need to reclaim or how we got into that state is
> tangential.

My understanding is that one of the advantages he was pointing out for
his mechanism over memcg is that it would allow one to count slab memory
as well, which memcg won't do (it will, now).

>> I am also failing to see how cpusets would be involved in here. I
>> understand that you may have free memory in terms of size, but still be
>> further restricted by cpuset. But I also think that having multiple
>> entry points for this buys us nothing at all. So the choices I see are:
>
> Umm, why do users of cpusets not want to be able to trigger memory
> pressure notifications?

Because cpusets only deal with memory placement, not memory usage.
And moving a task into a cpuset does not prevent you from doing any of
this: you could, as long as the same set of tasks is placed in a
corresponding memcg.

Of course there are a couple of use cases that could benefit from the
orthogonality, but I doubt it would justify the complexity in this case.



