Re: [RFC 0/3] Implementation of cgroup isolation

[RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-28
[RFC 1/3] Add mem_cgroup->isolated and configuration knob · Michal Hocko <hidden> · 2011-03-28
[RFC 2/3] Implement isolated LRU cgroups · Michal Hocko <hidden> · 2011-03-28
[RFC 3/3] Do not shrink isolated groups from the global reclaim · Michal Hocko <hidden> · 2011-03-28
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-28
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-28
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Zhu Yanhai <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Zhu Yanhai <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Zhu Yanhai <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-30
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-30
Re: [RFC 0/3] Implementation of cgroup isolation · Balbir Singh <bsingharora@gmail.com> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-30
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-30
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-31
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-31
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-04-01
Re: [RFC 0/3] Implementation of cgroup isolation · Balbir Singh <hidden> · 2011-03-31
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-28
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · KAMEZAWA Hiroyuki <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Ying Han <hidden> · 2011-03-29
Re: [RFC 0/3] Implementation of cgroup isolation · Michal Hocko <hidden> · 2011-03-29

From: KAMEZAWA Hiroyuki <hidden>
Date: 2011-03-29 00:54:29
Also in: lkml

On Mon, 28 Mar 2011 17:37:02 -0700
Ying Han [off-list ref] wrote:

On Mon, Mar 28, 2011 at 5:12 PM, KAMEZAWA Hiroyuki
[off-list ref] wrote:


On Mon, 28 Mar 2011 11:01:18 -0700
Ying Han [off-list ref] wrote:


On Mon, Mar 28, 2011 at 2:39 AM, Michal Hocko [off-list ref] wrote:


Hi all,

Memory cgroups can currently be used to throttle the memory usage of a group
of processes. They cannot, however, be used to isolate processes from the
rest of the system, because all pages that belong to the group are also
placed on the global LRU lists and so are eligible for global memory
reclaim.

This patchset aims at providing an opt-in memory cgroup isolation. This
means that a cgroup can be configured to be isolated from the rest of the
system by means of the cgroup virtual filesystem
(/dev/memctl/group/memory.isolated).

Thank you, Hugh, for pointing me to the thread. We are currently working
on a similar problem in memcg.

Here is the problem we see:
1. In memcg, a page is on both the per-memcg-per-zone LRU and the global LRU.
2. Global memory reclaim will throw pages away regardless of cgroup.
3. The zone->lru_lock is shared between the per-memcg-per-zone LRU and the
global LRU.

And we know:
1. We shouldn't do global reclaim since it breaks memory isolation.
2. There is no need for a page to be on both LRU lists, especially
after having per-memcg background reclaim.

So our approach is to take a page off the global LRU after it is
charged to a memcg. Only pages allocated in the root cgroup remain on
the global LRU, and each memcg reclaims pages on its isolated LRU.
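
The difference between shared and exclusive LRU lists can be modelled in a
few lines. What follows is a toy Python sketch, not kernel code: ToyLRU,
ToyZone, charge(), global_reclaim() and demo() are hypothetical names invented
for illustration, and real reclaim is far more involved. It only shows that
with shared lists a global reclaim pass can evict memcg pages, while with
exclusive lists it can only see root cgroup pages.

```python
from collections import OrderedDict


class ToyLRU:
    """Ordered set of page ids, oldest first (illustrative only)."""

    def __init__(self):
        self._pages = OrderedDict()

    def add(self, page):
        self._pages[page] = True

    def remove(self, page):
        self._pages.pop(page, None)

    def pop_oldest(self):
        page, _ = self._pages.popitem(last=False)
        return page

    def __len__(self):
        return len(self._pages)


class ToyZone:
    """One zone with a global LRU and one LRU per memcg (hypothetical model)."""

    def __init__(self, exclusive):
        self.exclusive = exclusive   # True: memcg pages leave the global LRU
        self.global_lru = ToyLRU()
        self.memcg_lru = {}          # memcg name -> ToyLRU
        self.owner = {}              # page -> memcg name, None for root

    def charge(self, page, memcg=None):
        self.owner[page] = memcg
        if memcg is not None:
            self.memcg_lru.setdefault(memcg, ToyLRU()).add(page)
        # Root pages always sit on the global LRU; memcg pages only do so
        # when the lists are shared.
        if memcg is None or not self.exclusive:
            self.global_lru.add(page)

    def global_reclaim(self, nr):
        """Evict up to nr pages from the global LRU, ignoring ownership."""
        evicted = []
        while len(evicted) < nr and len(self.global_lru):
            page = self.global_lru.pop_oldest()
            memcg = self.owner.pop(page)
            if memcg is not None:
                self.memcg_lru[memcg].remove(page)
            evicted.append((page, memcg))
        return evicted


def demo(exclusive):
    zone = ToyZone(exclusive)
    for i in range(4):
        zone.charge(f"A{i}", "A")    # pages charged to cgroup A
    for i in range(4):
        zone.charge(f"root{i}")      # pages charged to the root cgroup
    return zone.global_reclaim(6)


if __name__ == "__main__":
    print("shared:   ", demo(exclusive=False))
    print("exclusive:", demo(exclusive=True))
```

In the shared case the reclaim pass takes cgroup A's pages first, simply
because they are oldest. In the exclusive case it never touches them, but it
also cannot free more than the root cgroup holds, so global reclaim can come
up short.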

Why don't you use cpuset and virtual nodes? It's what you want.

We've been running a cpuset + fakenuma nodes configuration at Google to
provide memory isolation. Configuring the virtual box is complex: the
user needs to know in great detail which node to assign to which
cgroup. That is one of our motivations for moving towards the memory
controller, which simply does memory accounting no matter where pages
are allocated.

I think the current fake-numa is not useful because it works only at boot time.

That said, memcg simplified per-cgroup memory accounting, but memory
isolation is broken. This is one example, where pages are shared
between the global LRU and the per-memcg LRU. It is easy to get
cgroup-A's pages evicted by adding memory pressure to cgroup-B.

If you overcommit... right?

The approach we are considering, making page->lru exclusive, solves the
problem, and we should also be able to break the zone->lru_lock
sharing.

Is zone->lru_lock a problem even with the help of pagevecs?

If the LRU management folks ack isolating the LRUs and making kswapd etc.
more complex, okay, we'll go that way. This will _change_ the whole
memcg design and concepts. Maybe memcg should have some kind of balloon
driver to work well with an isolated LRU.

But my current position is "never adversely affect global reclaim".
So, I'm not very happy with the solution.

If we go that way, I guess we should have pseudo nodes/zones, which
were proposed in the early days of resource controls (not cgroup).

Thanks,
-Kame








