Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed

[PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
[PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-17
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-09-19
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> · 2021-09-20
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-20
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-09-21
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-22
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-22
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-23
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-24
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-09-27
Re: [PATCH mm] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-09-27
[PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Michal Hocko <mhocko@suse.com> · 2021-10-07
Re: [PATCH mm v2] vmalloc: back off when the current task is OOM-killed · Andrew Morton <akpm@linux-foundation.org> · 2021-10-07
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-10
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-13
Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-14
[PATCH memcg v2] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-09-14
Re: [PATCH memcg v2] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-09-16
[PATCH memcg v3] memcg: prohibit unconditional exceeding the limit of dying tasks · Vasily Averin <hidden> · 2021-10-05
Re: [PATCH memcg v3] memcg: prohibit unconditional exceeding the limit of dying tasks · Michal Hocko <mhocko@suse.com> · 2021-10-05

From: Andrew Morton <akpm@linux-foundation.org>
Date: 2021-09-21 18:55:27
Also in: linux-mm, lkml

On Mon, 20 Sep 2021 13:59:35 +0300 Vasily Averin [off-list ref] wrote:

On 9/20/21 4:22 AM, Tetsuo Handa wrote:


On 2021/09/20 8:31, Andrew Morton wrote:


On Fri, 17 Sep 2021 11:06:49 +0300 Vasily Averin [off-list ref] wrote:


A huge vmalloc allocation on a heavily loaded node can lead to a global
memory shortage. A task that called vmalloc can have the worst badness
and be chosen by the OOM killer; however, the received fatal signal and
the OOM victim mark do not interrupt the allocation cycle. Vmalloc will
continue allocating pages over and over again, exacerbating the crisis
and consuming the memory freed up by other killed tasks.

This patch allows the OOM killer to break the vmalloc cycle, which makes
OOM handling more effective and avoids a host panic.
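
For readers unfamiliar with the pattern, here is a minimal userspace
sketch of a page-by-page allocation loop that backs off once the task
has been marked as killed. It is not the patch itself: struct task_sim,
alloc_pages_backoff(), big_alloc() and free_pages_sim() are hypothetical
names made up for illustration, and malloc() stands in for the kernel
page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096 /* illustrative page size, an assumption */

/* Hypothetical stand-in for "the current task has a fatal signal pending". */
struct task_sim {
    bool fatal_signal;   /* set once the simulated OOM kill has arrived */
    size_t kill_at_page; /* page index at which the simulated kill arrives */
};

/*
 * Allocate nr_pages one page at a time. Returns the number of pages
 * actually allocated. A result smaller than nr_pages means the loop
 * backed off (task killed) or an allocation failed.
 */
size_t alloc_pages_backoff(struct task_sim *task, void **pages, size_t nr_pages)
{
    size_t done = 0;

    assert(task != NULL && pages != NULL);
    while (done < nr_pages) {
        if (done == task->kill_at_page)
            task->fatal_signal = true; /* simulate the kill arriving now */
        if (task->fatal_signal)
            break; /* back off instead of consuming more memory */
        pages[done] = malloc(PAGE_BYTES);
        if (!pages[done])
            break;
        done++;
    }
    return done;
}

/* Release the first nr pages of a partially or fully filled array. */
void free_pages_sim(void **pages, size_t nr)
{
    for (size_t i = 0; i < nr; i++)
        free(pages[i]);
}

/*
 * All-or-nothing wrapper: returns true with nr_pages allocated, or
 * false with every partially allocated page already released.
 */
bool big_alloc(struct task_sim *task, void **pages, size_t nr_pages)
{
    size_t done = alloc_pages_backoff(task, pages, nr_pages);

    if (done == nr_pages)
        return true;
    free_pages_sim(pages, done);
    return false;
}
```

The point of the sketch is that the back-off turns a kill into an
ordinary allocation failure, so every caller of big_alloc() has to
handle a false return correctly.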

Unfortunately it is not 100% safe. A previous attempt to break the vmalloc
cycle was reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when
the current task is killed"") because some vmalloc callers did not handle
failures properly. The issues found were resolved; however, there may
be other similar places.

Well that was lame of us.

I believe that at least one of the kernel testbots can utilize fault
injection.  If we were to wire up vmalloc (as we have done with slab
and pagealloc) then this will help to locate such buggy vmalloc callers.

Andrew, could you please clarify how we can do it?
Do you mean we can use the existing allocation fault injection infrastructure
to trigger this kind of issue? Unfortunately I found no way to reach this goal.
It allows emulating single faults with a small probability; however, that is
not enough: we need to completely disable all vmalloc allocations.

I don't see why there's a problem?  You're saying "there might still be
vmalloc() callers which don't correctly handle allocation failures",
yes?

I'm suggesting that we use fault injection to cause a small proportion
of vmalloc() calls to artificially fail, so such buggy callers will
eventually be found and fixed.  Why does such a scheme require that
*all* vmalloc() calls fail?
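
The idea of failing a small proportion of calls can be sketched in
userspace as follows. This is only an illustration under stated
assumptions, not the kernel's fault injection interface: struct
fault_cfg, should_fail_sim(), alloc_with_faults() and
count_injected_failures() are hypothetical names, malloc() stands in
for vmalloc(), and a small xorshift generator is used so that runs are
repeatable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical configuration for the simulated fault injector. */
struct fault_cfg {
    uint32_t probability_pct; /* chance, in percent, that a call fails */
    uint64_t rng_state;       /* PRNG state; must be non-zero */
};

/* xorshift64 step: deterministic pseudo-random numbers for repeatable runs. */
uint32_t next_rand(struct fault_cfg *cfg)
{
    uint64_t x = cfg->rng_state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    cfg->rng_state = x;
    return (uint32_t)(x >> 32);
}

/* Decide whether this particular call should fail artificially. */
bool should_fail_sim(struct fault_cfg *cfg)
{
    return next_rand(cfg) % 100 < cfg->probability_pct;
}

/* Allocation wrapper that fails a configurable fraction of calls. */
void *alloc_with_faults(struct fault_cfg *cfg, size_t size)
{
    if (should_fail_sim(cfg))
        return NULL; /* injected failure, same as a real one for the caller */
    return malloc(size);
}

/* Issue `calls` allocations and count how many of them returned NULL. */
unsigned count_injected_failures(struct fault_cfg *cfg, unsigned calls)
{
    unsigned failures = 0;

    for (unsigned i = 0; i < calls; i++) {
        void *p = alloc_with_faults(cfg, 64);

        if (!p)
            failures++;
        free(p); /* free(NULL) is a no-op */
    }
    return failures;
}
```

With a per-call failure probability p, a call site exercised N times is
hit at least once with probability 1 - (1 - p)^N, which is why a small
p can still be expected to reach frequently used callers over a long
test run.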
