Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying.
From: Michal Hocko <mhocko@suse.com>
Date: 2016-12-09 14:46:28
Subsystem: memory management, memory management - page allocator, the rest
Maintainers: Andrew Morton, Vlastimil Babka, Linus Torvalds

On Fri 09-12-16 23:23:10, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Thu 08-12-16 00:29:26, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Tue 06-12-16 19:33:59, Tetsuo Handa wrote:
> > > > > If the OOM killer is invoked when many threads are looping inside the
> > > > > page allocator, it is possible that the OOM killer is preempted by other
> > > > > threads.
> > > >
> > > > Hmm, the only way I can see this would happen is when the task which
> > > > actually manages to take the lock is not invoking the OOM killer for
> > > > whatever reason. Is this what happens in your case? Are you able to
> > > > trigger this reliably?
> > >
> > > Regarding http://I-love.SAKURA.ne.jp/tmp/serial-20161206.txt.xz ,
> > > somebody called oom_kill_process() and reached
> > >
> > >   pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
> > >
> > > line but did not reach
> > >
> > >   pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
> > >
> > > line within tolerable delay.
> >
> > I would be really interested in that. This can happen only if
> > find_lock_task_mm fails. This would mean that either we are selecting a
> > child without mm or the selected victim has no mm anymore. Both cases
> > should be ephemeral because oom_badness will rule those tasks on the
> > next round. So the primary question here is why no other task has hit
> > out_of_memory.
>
> This can also happen due to AB-BA livelock (oom_lock v.s. console_sem).

Care to explain what that livelock would look like?

> > Have you tried to instrument the kernel and see whether GFP_NOFS
> > contexts simply preempted any other attempt to get there?
> > I would find it quite unlikely but not impossible. If that is the case
> > we should really think how to move forward. One way is to make the oom
> > path fully synchronous as suggested below. Other is to tweak GFP_NOFS
> > some more and do not take the lock while we are evaluating that. This
> > sounds quite messy though.
>
> Do you mean "tweak GFP_NOFS" as something like below patch?
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3036,6 +3036,17 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
>  
>  	*did_some_progress = 0;
>  
> +	if (!(gfp_mask & (__GFP_FS | __GFP_NOFAIL))) {
> +		if ((current->flags & PF_DUMPCORE) ||
> +		    (order > PAGE_ALLOC_COSTLY_ORDER) ||
> +		    (ac->high_zoneidx < ZONE_NORMAL) ||
> +		    (pm_suspended_storage()) ||
> +		    (gfp_mask & __GFP_THISNODE))
> +			return NULL;
> +		*did_some_progress = 1;
> +		return NULL;
> +	}
> +
>  	/*
>  	 * Acquire the oom lock. If that fails, somebody else is
>  	 * making progress for us.
>
> Then, serial-20161209-gfp.txt in http://I-love.SAKURA.ne.jp/tmp/20161209.tar.xz
> is console log with above patch applied. Spinning without invoking the OOM
> killer. It did not avoid locking up.

OK, so the reason for the lockup must be something different. If we are
really {dead,live}locking on the printk because of warn_alloc then that
path should be tweaked instead. Something like the patch below should
rule this out:
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ed65d7df72d5..c2ba51cec93d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3024,11 +3024,14 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
 	unsigned int filter = SHOW_MEM_FILTER_NODES;
 	struct va_format vaf;
 	va_list args;
+	static DEFINE_MUTEX(warn_lock);
 
 	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
 	    debug_guardpage_minorder() > 0)
 		return;
 
+	mutex_lock(&warn_lock);
+
 	/*
 	 * This documents exceptions given to allocations in certain
 	 * contexts that are allowed to allocate outside current's set
@@ -3054,6 +3057,8 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
 	dump_stack();
 	if (!should_suppress_show_mem())
 		show_mem(filter);
+
+	mutex_unlock(&warn_lock);
 }
 
 static inline struct page *
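
The serialization idea in the patch above can be tried outside the kernel. What follows is only a minimal userspace sketch, assuming POSIX threads stand in for the concurrent allocators: a pthread mutex plays the role of warn_lock, and a multi-line "report" written into a shared array plays the role of the warn_alloc output. The names emit_report, run_reporters, log_owner and the two constants are hypothetical and exist only for this illustration; nothing here is kernel code.

```c
#include <pthread.h>

#define NREPORTERS       4
#define LINES_PER_REPORT 3

/* Shared "console": each slot records which reporter wrote that line. */
static int log_owner[NREPORTERS * LINES_PER_REPORT];
static int log_len;

/* Hypothetical userspace stand-in for the static warn_lock in the patch. */
static pthread_mutex_t warn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Emit one multi-line report; the lock keeps the whole report together. */
static void emit_report(int id)
{
	pthread_mutex_lock(&warn_lock);
	for (int i = 0; i < LINES_PER_REPORT; i++)
		log_owner[log_len++] = id;
	pthread_mutex_unlock(&warn_lock);
}

static void *reporter(void *arg)
{
	emit_report(*(int *)arg);
	return NULL;
}

/*
 * Start NREPORTERS threads that all report at once, then check the log.
 * Returns 0 if every report occupies contiguous slots, -1 otherwise.
 */
static int run_reporters(void)
{
	pthread_t tid[NREPORTERS];
	int ids[NREPORTERS];
	int created = 0;

	log_len = 0;
	for (int i = 0; i < NREPORTERS; i++) {
		ids[i] = i;
		if (pthread_create(&tid[i], NULL, reporter, &ids[i]) != 0)
			break;
		created++;
	}
	for (int i = 0; i < created; i++)
		pthread_join(tid[i], NULL);

	if (created != NREPORTERS ||
	    log_len != NREPORTERS * LINES_PER_REPORT)
		return -1;

	/* Each group of LINES_PER_REPORT slots must have a single owner. */
	for (int i = 0; i < log_len; i += LINES_PER_REPORT)
		for (int j = 1; j < LINES_PER_REPORT; j++)
			if (log_owner[i + j] != log_owner[i])
				return -1;
	return 0;
}
```

Calling run_reporters() from a test program should return 0, since only one thread at a time can be inside emit_report(). The sketch shows only the serialization property; it does not model printk, console_sem or oom_lock, so it says nothing about whether the lockup discussed in this thread goes away.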
--
Michal Hocko
SUSE Labs