Re: [PATCH 08/10] exit, oom: postpone exit_oom_victim to later


From: Michal Hocko <mhocko@kernel.org>
Date: 2016-08-02 11:31:32

On Tue 02-08-16 19:32:45, Tetsuo Handa wrote:

Michal Hocko wrote:

It is possible that a user creates a process with 10000 threads
and lets that process be OOM-killed. Then, this patch allows 10000 threads
to start consuming memory reserves after they left exit_mm(). OOM victims
are not the only threads who need to allocate memory for termination. Non
OOM victims might need to allocate memory at exit_task_work() in order to
allow OOM victims to make forward progress.

this might be possible but unlike the regular exiting tasks we do
reclaim oom victim's memory in the background. So while they can consume
memory reserves we should also give some (and arguably much more) memory
back. The reserves are there to expedite the exit.

Background reclaim does not occur on CONFIG_MMU=n kernels. But this patch
also affects CONFIG_MMU=n kernels. If a process with two threads was
OOM-killed and one thread consumed too much memory after it left exit_mm()
before the other thread sets MMF_OOM_SKIP on their mm by returning from
exit_aio() etc. in __mmput() from mmput() from exit_mm(), this patch
introduces a new possibility to OOM livelock. I think it is wild to assume
that "CONFIG_MMU=n kernels can OOM livelock even without this patch. Thus,
let's apply this patch even though this patch might break the balance of
OOM handling in CONFIG_MMU=n kernels."

As I've said if you have strong doubts about the patch I can drop it for
now. I do agree that nommu really matters here, though.

OK. Then, for now let's postpone only the oom_killer_disable() to later
rather than postpone the exit_oom_victim() to later.

that would require other changes (basically make oom_killer_disable
independent of TIF_MEMDIE) which I think doesn't belong to this pile. So
I would rather sacrifice this patch instead and it will not be part of
the v2.
 
[...]

I think that allocations from
do_exit() are important for terminating cleanly (from the point of view of
filesystem integrity and kernel object management) and such allocations
should not be given up simply because ALLOC_NO_WATERMARKS allocations
failed.

We are talking about a fatal condition when OOM killer forcefully kills
a task. Chances are that the userspace leaves so much state behind that
a manual cleanup would be necessary anyway. Depleting the memory
reserves is not nice but I really believe that this particular patch
doesn't make the situation really much worse than before.

I'm not talking about inconsistency in userspace programs. I'm talking
about inconsistency of objects managed by kernel (e.g. failing to drop
references) caused by allocation failures.

That would be a bug on its own, no?

Right, but memory allocations after exit_mm() from do_exit() (e.g.
exit_task_work()) might assume (or depend on) the "too small to fail"
memory-allocation rule where small GFP_FS allocations won't fail unless
TIF_MEMDIE is set, but this patch can unexpectedly break that rule if
they assume (or depend on) that rule.

Silent dependency on nofail semantics without GFP_NOFAIL is still a bug.
Full stop. I really fail to see why you are still arguing about that.
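
To illustrate the distinction being argued here, a minimal userspace sketch
follows. It is only a model, not kernel code: model_alloc(), the MODEL_GFP_*
flags, model_failures_left and both cleanup_*() callers are hypothetical names
invented for this example (the kernel's actual flag is __GFP_NOFAIL). It
contrasts a caller that checks for allocation failure with one that states
its no-fail requirement explicitly.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for allocation flags; NOT the real GFP bits. */
#define MODEL_GFP_KERNEL 0x0u
#define MODEL_GFP_NOFAIL 0x1u

/* Number of upcoming attempts that fail, simulating memory pressure. */
static int model_failures_left;

/* One allocation attempt; fails while the simulated pressure lasts. */
static void *model_try_alloc(size_t size)
{
	if (model_failures_left > 0) {
		model_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Model allocator: without MODEL_GFP_NOFAIL the caller may see NULL;
 * with it, the allocator retries until an attempt succeeds.
 */
static void *model_alloc(size_t size, unsigned int flags)
{
	void *p = model_try_alloc(size);

	while (!p && (flags & MODEL_GFP_NOFAIL))
		p = model_try_alloc(size);
	return p;
}

/* Caller that tolerates failure: it checks the result and unwinds. */
static int cleanup_may_fail(void)
{
	void *buf = model_alloc(64, MODEL_GFP_KERNEL);

	if (!buf)
		return -1;	/* no silent reliance on "too small to fail" */
	free(buf);
	return 0;
}

/* Caller that cannot tolerate failure: it says so through the flag. */
static int cleanup_must_succeed(void)
{
	void *buf = model_alloc(64, MODEL_GFP_NOFAIL);

	free(buf);	/* model_alloc() only returns here once it has memory */
	return 0;
}
```

In this model, a caller that passed MODEL_GFP_KERNEL and skipped the NULL
check would be the "silent dependency" case: it works only as long as the
allocator happens not to fail.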

[...]
-- 
Michal Hocko
SUSE Labs

