Thread (37 messages), 5 authors, 2021-05-06

Re: [PATCH v1 3/7] mm: rename and move page_is_poisoned()

From: Michal Hocko <mhocko@suse.com>
Date: 2021-05-06 07:06:21
Also in: linux-fsdevel, linux-hyperv, lkml

On Thu 06-05-21 08:56:11, Aili Yao wrote:
On Wed, 5 May 2021 15:27:39 +0200
Michal Hocko [off-list ref] wrote:
quoted
On Wed 05-05-21 15:17:53, David Hildenbrand wrote:
quoted
On 05.05.21 15:13, Michal Hocko wrote:  
quoted
On Thu 29-04-21 14:25:15, David Hildenbrand wrote:  
quoted
Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
introduced page_is_poisoned(), however, v5 [1] of the patch used
"page_is_hwpoison()" and something went wrong while upstreaming. Rename the
function and move it to page-flags.h, from where it can be used in other
-- kcore -- context.

Move the comment to the place where it belongs and simplify.

[1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine

Signed-off-by: David Hildenbrand <redacted>  
I do agree that being explicit about hwpoison is much better. A poisoned
page can also be an uninitialized one and I believe this is the reason why
you are bringing that up.
I'm bringing it up because I want to reuse that function as stated above :)
  
quoted
But you've made me look at d3378e86d182 and I am wondering whether this
is really a valid patch. First of all it can leak a reference count
AFAICS. Moreover it doesn't really fix anything because the page can be
marked hwpoison right after the check is done. I do not think the race
can feasibly be closed. So shouldn't we rather revert it?
I am not sure if we really care about races here that much? I mean,
essentially we are racing with HW breaking asynchronously. Just because we
would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
from breaking.  
Right
quoted
Long story short, this should be good enough for the cases we actually can
handle? What am I missing?  
I am not sure I follow. My point is that I fail to see any added value
in the check, as it doesn't prevent the race (it fundamentally cannot, as
the page can be poisoned at any time), while the failure path doesn't call
put_page(), which is incorrect even for hwpoison pages.
Sorry, I have something to say:

I noticed the ref count leak in the previous discussion, but I don't think
it really matters. In the memory recovery case for user pages, we keep one
reference to the poisoned page so that the error page is not freed to the
buddy allocator, which can be checked in the memory_failure() function.
So what would happen if those pages are hwpoisoned from userspace rather
than by HW? And repeatedly so?
-- 
Michal Hocko
SUSE Labs
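
The reference-count concern discussed in this message can be illustrated
with a small userspace model. This is only a sketch, not kernel code:
struct mock_page, mock_get_page(), mock_put_page(), lookup_leaky(),
lookup_balanced() and leaked_refs() are all hypothetical names invented
for the illustration, and the real code path in d3378e86d182 may differ
in detail. The model assumes a lookup that takes a reference first and
then checks the hwpoison state, which is the pattern the thread describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a page; NOT the kernel's struct page. */
struct mock_page {
    int refcount;
    bool hwpoison;
};

static void mock_get_page(struct mock_page *p) { p->refcount++; }
static void mock_put_page(struct mock_page *p) { p->refcount--; }

/*
 * Pattern criticised in the thread (as modelled here): the reference
 * taken before the check is not dropped when the check fails.
 */
static struct mock_page *lookup_leaky(struct mock_page *p)
{
    mock_get_page(p);
    if (p->hwpoison)
        return NULL;            /* reference is leaked on this path */
    return p;
}

/* Same check, but the failure path drops the reference it took. */
static struct mock_page *lookup_balanced(struct mock_page *p)
{
    mock_get_page(p);
    if (p->hwpoison) {
        mock_put_page(p);       /* balance the get above */
        return NULL;
    }
    return p;
}

/*
 * Look up one hwpoisoned page `attempts` times and return how many
 * references remain beyond the single initial one.
 */
static int leaked_refs(int attempts, bool balanced)
{
    struct mock_page page = { .refcount = 1, .hwpoison = true };

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++) {
        struct mock_page *got = balanced ? lookup_balanced(&page)
                                         : lookup_leaky(&page);
        if (got)
            mock_put_page(got); /* a successful caller drops its ref */
    }
    return page.refcount - 1;
}
```

In this model, leaked_refs(3, false) returns 3 while leaked_refs(3, true)
returns 0: every repeated lookup of a poisoned page through the leaky path
adds one reference that is never dropped, which is the scenario the
"repeatedly so" question above points at.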