Re: [RFC 1/6] mm, page_alloc: fix more premature OOM due to race with cpuset update


From: Michal Hocko <hidden>
Date: 2017-05-17 14:05:07
Also in: linux-api, linux-mm, lkml

On Wed 17-05-17 08:56:34, Christoph Lameter wrote:

On Wed, 17 May 2017, Michal Hocko wrote:

quoted

We certainly can do that. The page fault failures are due to the
admin trying to move an application that is not aware of this and is using
mempols. That could be an error. Trying to move an application that
contains both absolute and relative node numbers is definitely something
that is potentially so screwed up that the kernel should not muck around
with such an app.

Also user space can determine if the application is using memory policies
and can then take appropriate measures (message to the sysadmin to eval
the situation f.e.) or mess around with the process's memory policies on
its own.

So this is certainly a way out of this mess.

So how are you going to distinguish VM_FAULT_OOM from an empty mempolicy
case in a raceless way?

You don't have to do that if you do not create an empty mempolicy in the
first place. The current kernel code avoids that by first allowing access
to the new set of nodes and removing the old ones from the set when done.

which is racy, as Vlastimil pointed out. If we simply fail such an
allocation the failure will go up the call chain until we hit the OOM
killer due to VM_FAULT_OOM. How would you want to handle that?
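
For readers following the thread, a toy model of the "widen first, narrow when done" update may make the concern easier to picture. This is a sketch in Python, not kernel code: node sets are plain Python sets, `two_step_update` and `usable_nodes` are hypothetical names, and the assumption that an allocation combines an already narrowed mems_allowed with a mempolicy nodemask that has not been rebound yet is an illustration, not a trace of the actual kernel paths.

```python
# Toy model only: Python sets stand in for node masks.
# All names here are hypothetical and chosen for illustration.

def two_step_update(old, new):
    """Return the successive values of the allowed set during an update:
    first widened to old | new, then narrowed to new."""
    return [set(old) | set(new), set(new)]


def usable_nodes(mems_allowed, policy_nodes):
    """Nodes an allocation may use when it has to satisfy both
    the cpuset (mems_allowed) and the mempolicy (policy_nodes)."""
    return set(mems_allowed) & set(policy_nodes)


old_nodes, new_nodes = {0, 1}, {2, 3}

# Assumption: the mempolicy still names the old nodes at this point.
stale_policy = set(old_nodes)

widened, narrowed = two_step_update(old_nodes, new_nodes)

# While the allowed set is widened, the stale policy still finds nodes.
during_widening = usable_nodes(widened, stale_policy)

# Once the old nodes are removed, a stale policy has nothing left.
after_narrowing = usable_nodes(narrowed, stale_policy)

if not after_narrowing:
    print("no usable nodes: the allocation would fail")
```

In this model the widened state keeps the intersection non-empty, but any reader that pairs the narrowed set with a stale policy sees no usable node at all, which is the kind of failure that would then propagate as VM_FAULT_OOM.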
-- 
Michal Hocko
SUSE Labs
