Thread: 35 messages, 11 authors, 2016-08-29

Re: OOM detection regressions since 4.7

From: Greg KH <gregkh@linuxfoundation.org>
Date: 2016-08-22 13:31:55
Also in: lkml

On Mon, Aug 22, 2016 at 12:54:41PM +0200, Michal Hocko wrote:
> On Mon 22-08-16 06:05:28, Greg KH wrote:
> > On Mon, Aug 22, 2016 at 11:37:07AM +0200, Michal Hocko wrote:
> [...]
> > > From 899b738538de41295839dca2090a774bdd17acd2 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.com>
> > > Date: Mon, 22 Aug 2016 10:52:06 +0200
> > > Subject: [PATCH] mm, oom: prevent pre-mature OOM killer invocation for high
> > >  order request
> > >
> > > There have been several reports about pre-mature OOM killer invocation
> > > in 4.7 kernel when order-2 allocation request (for the kernel stack)
> > > invoked OOM killer even during basic workloads (light IO or even kernel
> > > compile on some filesystems). In all reported cases the memory is
> > > fragmented and there are no order-2+ pages available. There is usually
> > > a large amount of slab memory (usually dentries/inodes) and further
> > > debugging has shown that there are way too many unmovable blocks which
> > > are skipped during the compaction. Multiple reporters have confirmed that
> > > the current linux-next which includes [1] and [2] helped and OOMs are
> > > not reproducible anymore. A simpler fix for the stable is to simply
> > > ignore the compaction feedback and retry as long as there is a reclaim
> > > progress for high order requests which we used to do before. We already
> > > do that for CONFIG_COMPACTION=n so let's reuse the same code when
> > > compaction is enabled as well.
> > >
> > > [1] http://lkml.kernel.org/r/20160810091226.6709-1-vbabka@suse.cz
> > > [2] http://lkml.kernel.org/r/f7a9ea9d-bb88-bfd6-e340-3a933559305a@suse.cz
> > >
> > > Fixes: 0a0337e0d1d1 ("mm, oom: rework oom detection")
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  mm/page_alloc.c | 50 ++------------------------------------------------
> > >  1 file changed, 2 insertions(+), 48 deletions(-)
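
For readers following the patch description above, here is a minimal user-space sketch of the retry policy it describes: for a high-order request, compaction feedback is ignored and the allocator keeps retrying for as long as reclaim makes progress. This is an illustration only, not the code from mm/page_alloc.c. The function name, the enum, and the `pages_reclaimed` parameter are hypothetical, and the cap at the costly order (3, matching the kernel's PAGE_ALLOC_COSTLY_ORDER) is an assumption that the description itself does not spell out.

```c
#include <stdbool.h>

/* Same value as the kernel's PAGE_ALLOC_COSTLY_ORDER; the cap is an assumption. */
#define COSTLY_ORDER 3

/* Hypothetical stand-in for a compaction result; not the kernel's enum. */
enum compaction_feedback {
	COMPACTION_FAILED,
	COMPACTION_MADE_PROGRESS,
};

/*
 * Decide whether a high-order allocation should be retried before the
 * OOM killer is considered. Only answers the question for high-order
 * requests (order >= 1); order-0 retries are outside this sketch.
 */
static bool should_retry_high_order(unsigned int order,
				    unsigned long pages_reclaimed,
				    enum compaction_feedback feedback)
{
	/* Compaction feedback is deliberately ignored. */
	(void)feedback;

	/* Not a high-order request, or a costly one: do not retry here. */
	if (order == 0 || order > COSTLY_ORDER)
		return false;

	/* Retry for as long as reclaim is still making progress. */
	return pages_reclaimed > 0;
}
```

With this model, an order-2 request (the kernel stack case from the reports) is retried whenever the last reclaim pass freed something, even if compaction reported failure; it stops being retried only once reclaim frees nothing.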
> > So, if this goes into Linus's tree, can you let stable@vger.kernel.org
> > know about it so we can add it to the 4.7-stable tree?  Otherwise
> > there's not much I can do here now, right?
>
> My plan would be actually to not push this to Linus because we have a
> proper fix for Linus tree. It is just that the fix is quite large and I
> felt like the stable should get the most simple fix possible, which is
> this partial revert. So, what I am trying to tell is to push a non-linus
> patch to stable as it is simpler.

I _REALLY_ hate taking any patches that are not in Linus's tree as 90%
of the time (well, almost always), it ends up being wrong and hurting us
in the end.

What exactly are the commits that are in Linus's tree that resolve this
issue?

thanks,

greg k-h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>