Thread: 21 messages, 3 authors, 2017-10-04

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

From: Michal Hocko <mhocko@kernel.org>
Date: 2017-10-04 09:29:47
Also in: linux-mm, lkml

On Tue 03-10-17 07:35:59, Tejun Heo wrote:
> Hello, Michal.
>
> On Tue, Oct 03, 2017 at 04:22:46PM +0200, Michal Hocko wrote:
> > On Tue 03-10-17 15:08:41, Roman Gushchin wrote:
> > > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote:
> > [...]
> > > > I guess we want to inherit the value on the memcg creation but I agree
> > > > that enforcing parent setting is weird. I will think about it some more
> > > > but I agree that it is saner to only enforce per memcg value.
> > >
> > > I'm not against, but we should come up with a good explanation of why
> > > we inherit it, or why we don't.
> >
> > Inheriting sounds like a less surprising behavior. Once you opt in for
> > oom_group you can expect that descendants are going to assume the same
> > unless they explicitly state otherwise.
>
> Here's a counter example.
>
> Let's say there's a container which hosts one main application, and
> the container shares its host with other containers.
>
> * Let's say the container is a regular containerized OS instance and
>   can't really guarantee system integrity if one of its processes gets
>   randomly killed.
>
> * However, the application that it's running inside an isolated cgroup
>   is more intelligent and composed of multiple interchangeable
>   processes and can treat killing of a random process as partial
>   capacity loss.
>
> When the host is setting up the outer container, it doesn't
> necessarily know whether the containerized environment would be able
> to handle partial OOM kills or not.  It's akin to panic_on_oom setting
> at system level - it's the containerized instance itself which knows
> whether it can handle partial OOM kills or not.  This is why this knob
> should be delegatable.
>
> Now, the container itself has group OOM set and the isolated main
> application is starting up.  It obviously wants partial OOM kills
> rather than group killing.  This is the same principle.  The
> application which is being contained in the cgroup is the one which
> knows how it can handle OOM conditions, not the outer environment, so
> it obviously needs to be able to set the configuration it wants.

Yes, this makes a lot of sense. On the other hand, we used to copy other
reclaim-specific attributes like swappiness and oom_kill_disable.

I guess we should be OK with "non-hierarchical" behavior when it is
documented properly so that there are no surprises.
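
To make the two options concrete, here is a toy model of the semantics
being discussed. It is only a sketch, not kernel code: the Memcg class,
the inherit flag and kill_scope() are made up for illustration, and only
the oom_group name comes from this thread. The value may be copied from
the parent once at creation time, but after that each memcg's own
setting is the only one that counts.

```python
class Memcg:
    """Toy stand-in for a memory cgroup; not the kernel data structure."""

    def __init__(self, name, parent=None, inherit=True):
        self.name = name
        self.parent = parent
        # Copy the parent's value once, at creation time (the "inherit"
        # option). With inherit=False the knob simply starts out unset.
        self.oom_group = parent.oom_group if (parent and inherit) else False

    def kill_scope(self):
        # Non-hierarchical: only this memcg's own value decides, no
        # matter what any ancestor is currently set to.
        return "whole group" if self.oom_group else "single process"


container = Memcg("container")
container.oom_group = True            # containerized OS wants group kills

app = Memcg("app", parent=container)  # starts with the copied value
app.oom_group = False                 # the application opts out for itself

print(container.kill_scope())
print(app.kill_scope())
```

In this model the application from Tejun's example can pick partial
kills even though the container around it has group OOM set, which is
the delegation case above.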

-- 
Michal Hocko
SUSE Labs
