Re: [PATCH v1] mm/hwpoison: Retry with shake_page() for unhandlable pages
From: Yang Shi <hidden>
Date: 2021-08-18 18:08:00
Also in:
lkml
On Mon, Aug 16, 2021 at 10:37 PM Naoya Horiguchi [off-list ref] wrote:
From: Naoya Horiguchi <redacted>

HWPoisonHandlable() sometimes returns false for typical user pages
due to races with average memory events like transfers over LRU lists.
This causes failures in hwpoison handling.

There's retry code for such a case but does not work because the retry
loop reaches the retry limit too quickly before the page settles down to
handlable state. Let get_any_page() call shake_page() to fix it.

Fixes: 25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Naoya Horiguchi <redacted>
Cc: stable@vger.kernel.org # 5.13
---
 mm/memory-failure.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git v5.14-rc6/mm/memory-failure.c v5.14-rc6_patched/mm/memory-failure.c
index eefd823deb67..aa6592540f17 100644
--- v5.14-rc6/mm/memory-failure.c
+++ v5.14-rc6_patched/mm/memory-failure.c
@@ -1146,7 +1146,7 @@ static int __get_hwpoison_page(struct page *page)
 	 * unexpected races caused by taking a page refcount.
 	 */
 	if (!HWPoisonHandlable(head))
-		return 0;
+		return -EBUSY;
 
 	if (PageTransHuge(head)) {
 		/*
@@ -1199,9 +1199,14 @@ static int get_any_page(struct page *p, unsigned long flags)
 			}
 			goto out;
 		} else if (ret == -EBUSY) {
-			/* We raced with freeing huge page to buddy, retry. */
-			if (pass++ < 3)
+			/*
+			 * We raced with (possibly temporary) unhandlable
+			 * page, retry.
+			 */
+			if (pass++ < 3) {
+				shake_page(p, 1);
 				goto try_again;
+			}

I think the return value should be set to -EIO before jumping to out. I suppose -EIO means this is really, or very likely, an unhandlable page.

 			goto out;
 		}
 	}
--
2.25.1
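
To make the suggestion concrete, here is a minimal user-space sketch of the retry path with the return value converted to -EIO once the retries run out. It is only a model, not kernel code: mock_get_hwpoison_page(), mock_shake_page(), mock_get_any_page(), run_model() and the settle_after knob are hypothetical stand-ins for illustration, and the real get_any_page() has more branches than shown here.

```c
#include <errno.h>

/* Hypothetical model state, not kernel data structures. */
static int settle_after;	/* attempts that fail before the page becomes handlable; < 0 means never */
static int attempts;		/* calls made to the mock refcount helper */
static int shakes;		/* calls made to the mock shake helper */

/* Stand-in for __get_hwpoison_page(): -EBUSY until the page "settles". */
static int mock_get_hwpoison_page(void)
{
	attempts++;
	if (settle_after >= 0 && attempts > settle_after)
		return 1;	/* refcount taken, page is handlable */
	return -EBUSY;
}

/* Stand-in for shake_page(): only counts invocations. */
static void mock_shake_page(void)
{
	shakes++;
}

/* Models only the -EBUSY branch of get_any_page(). */
static int mock_get_any_page(void)
{
	int ret, pass = 0;

try_again:
	ret = mock_get_hwpoison_page();
	if (ret == -EBUSY) {
		if (pass++ < 3) {
			mock_shake_page();
			goto try_again;
		}
		/* Suggested change: retries exhausted, report unhandlable. */
		ret = -EIO;
	}
	return ret;
}

/* Reset the model state and run one simulated call. */
static int run_model(int fail_count)
{
	settle_after = fail_count;
	attempts = 0;
	shakes = 0;
	return mock_get_any_page();
}
```

In this model, run_model(2) returns 1 after two shakes, while run_model(-1) gives up after three shakes and returns -EIO rather than leaving -EBUSY in ret.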