Re: [v6 2/4] mm, oom: cgroup-aware OOM killer

From: Roman Gushchin <hidden>
Date: 2017-08-24 13:59:16
Also in: linux-mm, lkml

On Thu, Aug 24, 2017 at 02:58:11PM +0200, Michal Hocko wrote:

On Thu 24-08-17 13:28:46, Roman Gushchin wrote:

Hi Michal!

There is nothing like a "better victim". We are pretty much in a
catastrophic situation when we try to survive by killing a userspace.

Not necessarily, it can be a cgroup OOM.

We try to kill the largest because that assumes that we return the
most memory from it. Now I do understand that you want to treat the
memcg as a single killable entity but I find it really questionable
to do a per-memcg metric and then not treat it like that and kill
only a single task. Just imagine a single memcg with zillions of tasks,
each very small, and you select it as the largest while a small task
by itself doesn't help to get us out of the OOM.

I don't think it's different from a non-containerized state: if you
have a zillion small tasks in the system, you'll meet the same issues.
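
To make the scenario under discussion concrete, here is a minimal sketch of it as a toy model. It is not kernel code: the Cgroup class, the page counts, and the helper names are all hypothetical, chosen only to show how a memcg can win a per-memcg size comparison while its largest single task frees almost nothing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cgroup:
    """Toy stand-in for a memory cgroup (hypothetical, not a kernel structure)."""
    name: str
    task_pages: List[int]  # pages used by each task in the group

    def total_pages(self) -> int:
        # Per-memcg metric: everything the group uses, summed.
        return sum(self.task_pages)

    def largest_task_pages(self) -> int:
        # What a single-task kill inside this group would free at best.
        return max(self.task_pages)


def pick_largest_cgroup(cgroups: List[Cgroup]) -> Cgroup:
    """Select the victim group by the per-memcg metric."""
    return max(cgroups, key=lambda c: c.total_pages())


# Assumed numbers, for illustration only.
many_small = Cgroup("many_small", [10] * 100_000)  # 100,000 tasks of 10 pages
one_big = Cgroup("one_big", [800_000])             # one task of 800,000 pages

victim = pick_largest_cgroup([many_small, one_big])
freed_by_single_kill = victim.largest_task_pages()
freed_by_group_kill = victim.total_pages()

print(victim.name, freed_by_single_kill, freed_by_group_kill)
```

In this model the group of small tasks is selected, and killing one task from it returns 10 pages, while killing the whole group returns all of its pages. The same imbalance appears if the small tasks are not grouped at all, which is the point of the reply above.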

I guess I have asked already and we haven't reached any consensus. I do
not like how you treat memcgs and tasks differently. Why cannot we have
a memcg score a sum of all its tasks?

It sounds like a more expensive way to get almost the same with less accuracy.
Why is it better?

Because then you are comparing apples to apples?

Well, I can say that I compare some number of pages against some other number
of pages. And the relation between a page and a memcg is more obvious than the
relation between a page and a process.

Neither way is ideal; the sum of the processes is not ideal either.
Especially if you take oom_score_adj into account. Will you respect it?

I actually started with such an approach, but then found it weird.
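
The two metrics being compared can be sketched side by side. This is a simplified model under stated assumptions, not the patch's implementation: task_score() loosely imitates how a per-task badness value shifts with oom_score_adj, the "charged" field stands for pages charged to the memcg (including memory not attributable to a single task), and every name and number here is hypothetical.

```python
TOTALPAGES = 1_000_000  # assumed amount of memory under consideration, in pages


def task_score(pages: int, oom_score_adj: int, totalpages: int = TOTALPAGES) -> int:
    """Simplified per-task score: usage shifted by oom_score_adj (-1000..1000)."""
    if oom_score_adj == -1000:
        return 0  # treated as unkillable in this model
    points = pages + oom_score_adj * (totalpages // 1000)
    return max(points, 1)  # an eligible task never scores below 1


def memcg_score_by_task_sum(memcg: dict) -> int:
    """Metric 1: memcg score as the sum of its tasks' scores."""
    return sum(task_score(pages, adj) for pages, adj in memcg["tasks"])


def memcg_score_by_pages(memcg: dict) -> int:
    """Metric 2: memcg score as the number of pages charged to it."""
    return memcg["charged"]


# Hypothetical groups: (pages, oom_score_adj) per task, plus charged pages.
group_a = {"name": "A", "tasks": [(200_000, 0), (200_000, 0)], "charged": 450_000}
group_b = {"name": "B", "tasks": [(500_000, -500)], "charged": 500_000}

groups = [group_a, group_b]
victim_by_sum = max(groups, key=memcg_score_by_task_sum)["name"]
victim_by_pages = max(groups, key=memcg_score_by_pages)["name"]

print(victim_by_sum, victim_by_pages)
```

With these assumed numbers the two metrics disagree: summing task scores honours the negative oom_score_adj in group B and selects A, while counting charged pages ignores it and selects B. Which behaviour is wanted is exactly the open question in the exchange above.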

Besides that you have
to check each task for over-killing anyway. So I do not see any
performance merits here.

It's an implementation detail, and we can hopefully get rid of it at some point.

How do you want to compare a memcg score with a task's score?

I have to do it for tasks in the root cgroup, but it shouldn't be a common case.

How come? I can easily imagine a setup where only some memcgs
really do need kill-all semantics while all others can live with a single
task killed perfectly fine.

I mean, taking the unified cgroup hierarchy into account, there should not
be a lot of tasks in the root cgroup, if any.
