Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed

[PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
[PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-17
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-09-19
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> · 2021-09-20
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-20
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-09-21
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-22
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-22
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-23
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-24
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-27
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-27
[PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-10-07
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-10-07
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-14
[PATCH memcg v2] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-14
Re: [PATCH memcg v2] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-16
[PATCH memcg v3] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH memcg v3] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-10-05

From: Michal Hocko <hidden>
Date: 2021-09-22 12:27:42
Also in: linux-mm, lkml

On Fri 17-09-21 11:06:49, Vasily Averin wrote:

A huge vmalloc allocation on a heavily loaded node can lead to a global
memory shortage. A task calling vmalloc can have the worst badness
and be chosen by the OOM killer; however, the received fatal signal and
the OOM victim mark do not interrupt the allocation cycle. Vmalloc will
continue allocating pages over and over again, exacerbating the crisis
and consuming the memory freed up by other killed tasks.

This patch allows the OOM killer to break the vmalloc cycle, makes OOM
handling more effective, and avoids a host panic.

Unfortunately it is not 100% safe. A previous attempt to break the vmalloc
cycle was reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when
the current task is killed"") because some vmalloc callers did not handle
failures properly. The issues found were resolved; however, there may
be other similar places.

Such failures may be acceptable for emergencies, such as OOM. On the other
hand, we would like to detect them earlier. However, they are quite rare
and will be hidden by OOM messages, so I'm afraid they will have quite
a small chance of being noticed and reported.

To improve the detection of such places, this patch also interrupts the vmalloc
allocation cycle for all fatal signals. The checks are hidden under the DEBUG_VM
config option so as not to break unaware production kernels.

I really dislike this. We shouldn't have semantically different
behavior for a debugging kernel.

Is there any technical reason not to do the fatal_signal_pending bailout
unconditionally? An OOM-victim-based check makes the bailout less likely
to trigger, and therefore any potential bugs are just hidden more. So I
think we should really go with a fatal_signal_pending check here.
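
As an illustration only (this is not code from the thread), the
unconditional bailout discussed here can be modelled in userspace C.
Everything prefixed with model_ is a hypothetical stand-in rather than a
kernel API, and malloc() merely stands in for a single page allocation.
The sketch assumes the check sits at the top of the per-page loop and that
a short count is treated by the caller as a failed request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's fatal_signal_pending(current). */
static bool model_signal_pending;

/* Hypothetical test hook: simulate a fatal signal arriving after N pages. */
static size_t model_kill_after = (size_t)-1;

static bool model_fatal_signal_pending(void)
{
	return model_signal_pending;
}

/*
 * Allocate up to nr_pages "pages" into pages[] and return how many were
 * obtained. A short count means the request failed and the caller must
 * free what it received.
 */
static size_t model_alloc_area_pages(void **pages, size_t nr_pages,
				     size_t page_size)
{
	size_t nr = 0;

	assert(pages != NULL);

	while (nr < nr_pages) {
		/* Unconditional check: identical behavior in debug and production builds. */
		if (model_fatal_signal_pending())
			break;

		pages[nr] = malloc(page_size);
		if (!pages[nr])
			break;
		nr++;

		/* Simulated signal delivery, for demonstration only. */
		if (nr == model_kill_after)
			model_signal_pending = true;
	}
	return nr;
}

/* Release the first nr entries of pages[]. */
static void model_free_area_pages(void **pages, size_t nr)
{
	while (nr--)
		free(pages[nr]);
}
```

Because the check does not depend on a build option, a caller that
mishandles the short count misbehaves the same way on every kernel
configuration, which is what makes such bugs more likely to be reported.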


Vmalloc uses the new alloc_pages_bulk subsystem, so the newly added checks
can affect other users of this subsystem.

Signed-off-by: Vasily Averin <redacted>
---
 mm/page_alloc.c | 5 +++++
 mm/vmalloc.c    | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..133d52e507ff 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c

@@ -5288,6 +5288,11 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 			continue;
 		}
 
+		if (tsk_is_oom_victim(current) ||
+		    (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		     fatal_signal_pending(current)))
+			break;

This allocator interface is used in some real hot paths. It is also
meant to be a fail-fast interface (e.g. it only allocates from the pcp
allocator), so it shouldn't bring any additional risk of memory depletion
under heavy memory pressure.

In other words I do not see any reason to bail out in this code path.
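
To make the fail-fast argument concrete, here is a minimal userspace
sketch (not code from the thread). struct model_pcp and
model_alloc_pages_bulk() are hypothetical names, and the bounded stack is
only an assumed stand-in for a per-CPU free list. The point it illustrates
is that a loop which only drains a bounded cache, and never refills,
reclaims, or retries, terminates quickly on its own.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PCP_CAPACITY 4

/* Hypothetical stand-in for a per-CPU free list: a small bounded stack. */
struct model_pcp {
	void *free[MODEL_PCP_CAPACITY];
	size_t count;
};

/*
 * Fail-fast bulk allocation: hand out up to nr pages from the cache and
 * return how many were handed out. The loop is bounded by the cache size,
 * so it cannot spin under memory pressure.
 */
static size_t model_alloc_pages_bulk(struct model_pcp *pcp, size_t nr,
				     void **out)
{
	size_t got = 0;

	assert(pcp != NULL && out != NULL);

	while (got < nr && pcp->count > 0)
		out[got++] = pcp->free[--pcp->count];

	return got;
}
```

In this model a request for more pages than the cache holds simply returns
a short count, and it is the caller's outer loop that decides whether to
keep going.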

-- 
Michal Hocko
SUSE Labs
