Thread: 161 messages, 7 authors, 2013-07-25

Re: [PATCH -mm] memcg: do not trigger OOM from add_to_page_cache_locked

From: Michal Hocko <hidden>
Date: 2012-11-26 19:03:35
Also in: linux-mm, lkml

On Mon 26-11-12 13:24:21, Johannes Weiner wrote:
On Mon, Nov 26, 2012 at 07:04:44PM +0100, Michal Hocko wrote:
On Mon 26-11-12 12:46:22, Johannes Weiner wrote:
[...]
I think global oom already handles this in a much better way: invoke
the OOM killer, sleep for a second, then return to userspace to
relinquish all kernel resources and locks.  The only reason why we
can't simply change from an endless retry loop is because we don't
want to return VM_FAULT_OOM and invoke the global OOM killer.
Exactly.
But maybe we can return a new VM_FAULT_OOM_HANDLED for memcg OOM and
just restart the pagefault.  Return -ENOMEM to the buffered IO syscall
respectively.  This way, the memcg OOM killer is invoked as it should
but nobody gets stuck anywhere livelocking with the exiting task.
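
The control flow proposed above can be sketched as a small userspace model. This is only an illustration, not kernel code: VM_FAULT_OOM_HANDLED exists only as a proposal in this thread, and every name below (memcg_model, try_charge, fault_once, handle_fault, buffered_write) is hypothetical. The OOM killer is modelled as simply freeing a fixed number of pages from a victim.

```c
#include <errno.h>

/* Hypothetical model of a memory cgroup: usage and limit in pages. */
struct memcg_model {
    long usage;
    long limit;
    int oom_kills;              /* how often the modelled OOM killer ran */
};

/* Hypothetical result codes for this sketch only. */
enum charge_result { CHARGE_OK, CHARGE_MEMCG_OOM };
enum fault_result { FAULT_DONE, FAULT_OOM_HANDLED };

/* Model of the memcg OOM killer: a victim exits and frees its pages. */
static void memcg_oom_kill(struct memcg_model *cg, long victim_pages)
{
    cg->usage -= victim_pages;
    if (cg->usage < 0)
        cg->usage = 0;
    cg->oom_kills++;
}

/* A single charge attempt; it never loops and never sleeps. */
static enum charge_result try_charge(struct memcg_model *cg, long pages)
{
    if (cg->usage + pages > cg->limit)
        return CHARGE_MEMCG_OOM;
    cg->usage += pages;
    return CHARGE_OK;
}

/*
 * One pass through the fault path. On a failed charge the OOM killer is
 * invoked once and the caller is told the OOM was handled, instead of
 * retrying endlessly inside the charge path while holding locks.
 */
static enum fault_result fault_once(struct memcg_model *cg, long pages,
                                    long victim_pages)
{
    if (try_charge(cg, pages) == CHARGE_OK)
        return FAULT_DONE;
    memcg_oom_kill(cg, victim_pages);
    return FAULT_OOM_HANDLED;
}

/*
 * Fault entry point: restart the whole fault after an OOM was handled.
 * Returns the number of restarts needed, or -1 if max_restarts was hit.
 */
static int handle_fault(struct memcg_model *cg, long pages,
                        long victim_pages, int max_restarts)
{
    for (int i = 0; i <= max_restarts; i++) {
        if (fault_once(cg, pages, victim_pages) == FAULT_DONE)
            return i;
    }
    return -1;
}

/* Buffered IO path: invoke the OOM killer, then fail with -ENOMEM. */
static int buffered_write(struct memcg_model *cg, long pages,
                          long victim_pages)
{
    if (try_charge(cg, pages) == CHARGE_OK)
        return 0;
    memcg_oom_kill(cg, victim_pages);
    return -ENOMEM;
}
```

In this model the page fault is restarted from the top after the kill, while the buffered write returns -ENOMEM to its caller; in both cases the OOM killer still runs, and neither path loops inside the charge.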
Hmm, we would still have a problem with OOM disabled (aka user space OOM
killer), right? All processes but those in mem_cgroup_handle_oom are
at risk of being killed.
Could we still let everybody get stuck in there when the OOM killer is
disabled and let userspace take care of it?
I am not sure what exactly you mean by "userspace take care of it" but
if those processes are stuck and holding the lock then it is usually
hard to find that out. Well, if somebody is familiar with the internals
then it is doable, but this makes the interface really unusable for
regular usage.
Another POV might be: why should we trigger the OOM killer from those
paths in the first place? Write or read (or even readahead) are all calls
that should rather fail than invoke the OOM killer, in my opinion.
Readahead is arguable, but we kill globally for read() and write() and
I think we should do the same for memcg.
Fair point, but the global case is a little bit easier than memcg here
because nobody can hook into the OOM killer and provide a userspace
implementation for it, which is one of the cooler features of memcg...
I am open to any suggestions but we should fix this somehow (and
backport it to stable trees, as this has been there for quite some time).
The current report shows that the problem is not that hard to trigger.
The OOM killer is there to resolve a problem that comes from
overcommitting the machine but the overuse does not have to be from
the application that pushes the machine over the edge, that's why we
don't just kill the allocating task but actually go look for the best
candidate.  If you have one memory hog that overuses the resources,
attempted memory consumption in a different program should invoke the
OOM killer.
It does not matter if this is a page fault (would still happen with
your patch) or a buffered read/write (would no longer happen).
True, and it is sad that mmap then behaves slightly differently from
read/write, which I should have mentioned in the changelog. As I said,
I am open to other suggestions.

Thanks
-- 
Michal Hocko
SUSE Labs