Thread (79 messages) 79 messages, 8 authors, 2017-10-02

Re: [v8 0/4] cgroup-aware OOM killer

From: Roman Gushchin <hidden>
Date: 2017-09-27 10:19:51
Also in: linux-mm, lkml

On Wed, Sep 27, 2017 at 09:43:19AM +0200, Michal Hocko wrote:
On Tue 26-09-17 20:37:37, Tim Hockin wrote:
[...]
quoted
I feel like David has offered examples here, and many of us at Google
have offered examples as long ago as 2013 (if I recall) of cases where
the proposed heuristic is EXACTLY WRONG.
I do not think we have discussed anything resembling the current
approach. And I would really appreciate some more examples where
decisions based on leaf nodes would be EXACTLY WRONG.
I would agree here.

The discussing two-step approach (select biggest leaf or oom_group memcg,
then select largest process inside) does really look as a way to go.

It should work well in practice and it allows further development.
It will catch workloads which are leaking child processes by default,
which is an advantage in comparison to the existing algorithm.

Both strong hierarchical approach (as in v8) and pure flat (by Johannes)
are more limiting. In first case, deep hierarchies are affected (as Michal
mentioned) and we stick with tree traverse policy (Tejun's point).

In second case, the further development is under a question: any new idea
(say, oom_priorities, or, for example, if we will have a new useful memcg
metric) should be applied to processes and memcgs simultaneously.
Also, We drop any idea of memcg-level fairness and obtain some implementation
issues (which I mentioned earlier). The idea of mixing tasks and memcgs
leads to a much more hairy code, and the OOM code is already quite hairy.
The idea of comparing killable entities is a leaking abstraction,
as we can't predict how much memory killing a single process will release
(say, for example, the process is the init in a pid namespace).
quoted
We need OOM behavior to kill in a deterministic order configured by
policy.
And nobody is objecting to this usecase. I think we can build a priority
policy on top of leaf-based decision as well. The main point we are
trying to sort out here is a reasonable semantic that would work for
most workloads. Sibling based selection will simply not work on those
that have to use deeper hierarchies for organizational purposes. I
haven't heard a counter argument for that example yet.
Yes, implementing oom_priorities is a ~15 lines patch on top of
the discussing approach. David can use this small off-stream patch
for now, in any case it's a step forward in comparison to the existing state.


Overall, do we have any open question left? Does anyone has any strong
arguments against the discussing design?

Thanks!
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help