Thread: 27 messages, 7 authors, 2021-01-14

Re: [PATCH 04/10] mm, fsdax: Refactor memory-failure handler for dax mapping

From: Dan Williams <hidden>
Date: 2021-01-14 20:39:42
Also in: linux-fsdevel, linux-raid, linux-xfs, lkml, nvdimm

On Wed, Dec 30, 2020 at 8:59 AM Shiyang Ruan [off-list ref] wrote:
The current memory_failure_dev_pagemap() can only handle single-mapped
dax page for fsdax mode.  The dax page could be mapped by multiple files
and offsets if we let reflink feature & fsdax mode work together.  So,
we refactor current implementation to support handle memory failure on
each file and offset.

Signed-off-by: Shiyang Ruan <redacted>
---
 fs/dax.c            | 21 +++++++++++
 include/linux/dax.h |  1 +
 include/linux/mm.h  |  9 +++++
 mm/memory-failure.c | 91 ++++++++++++++++++++++++++++++++++-----------
 4 files changed, 100 insertions(+), 22 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 5b47834f2e1b..799210cfa687 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -378,6 +378,27 @@ static struct page *dax_busy_page(void *entry)
        return NULL;
 }

+/*
+ * dax_load_pfn - Load pfn of the DAX entry corresponding to a page
+ * @mapping: The file whose entry we want to load
+ * @index:   The offset where the DAX entry located in
+ *
+ * Return:   pfn of the DAX entry
+ */
+unsigned long dax_load_pfn(struct address_space *mapping, unsigned long index)
+{
+       XA_STATE(xas, &mapping->i_pages, index);
+       void *entry;
+       unsigned long pfn;
+
+       xas_lock_irq(&xas);
+       entry = xas_load(&xas);
+       pfn = dax_to_pfn(entry);
+       xas_unlock_irq(&xas);
+
+       return pfn;
+}
+
 /*
  * dax_lock_mapping_entry - Lock the DAX entry corresponding to a page
  * @page: The page whose entry we want to lock
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..89e56ceeffc7 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -150,6 +150,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,

 struct page *dax_layout_busy_page(struct address_space *mapping);
 struct page *dax_layout_busy_page_range(struct address_space *mapping, loff_t start, loff_t end);
+unsigned long dax_load_pfn(struct address_space *mapping, unsigned long index);
 dax_entry_t dax_lock_page(struct page *page);
 void dax_unlock_page(struct page *page, dax_entry_t cookie);
 #else
diff --git a/include/linux/mm.h b/include/linux/mm.h
index db6ae4d3fb4e..db3059a1853e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1141,6 +1141,14 @@ static inline bool is_device_private_page(const struct page *page)
                page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }

+static inline bool is_device_fsdax_page(const struct page *page)
+{
+       return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+               IS_ENABLED(CONFIG_DEVICE_PRIVATE) &&
+               is_zone_device_page(page) &&
+               page->pgmap->type == MEMORY_DEVICE_FS_DAX;
+}
+
Have a look at the recent fixes to pfn_to_online_page() vs DAX pages [1].

The page type check above is racy, given that the pfn could stop being
pfn_valid() while the check is running. I think hwpoison_filter()
needs an explicit check for whether the page is already referenced or
not. For example, the current call to hwpoison_filter() from
memory_failure_dev_pagemap() is safe because the page has already been
validated as ZONE_DEVICE, so it is safe to dereference page->pgmap.
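
To illustrate the ordering concern, here is a minimal userspace model, not
kernel code. The struct definitions are simplified stand-ins for the kernel
types, the enum values are arbitrary, and fsdax_page_referenced() and its
pgmap_ref_held parameter are hypothetical names invented for this sketch.
The only point it tries to show is that page->pgmap is read after the
caller has established that the page is already referenced, never before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types; values are arbitrary in this model. */
enum memory_type {
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_FS_DAX,
};

struct dev_pagemap {
	enum memory_type type;
};

struct page {
	bool zone_device;		/* models is_zone_device_page() */
	struct dev_pagemap *pgmap;	/* only valid while referenced */
};

/*
 * Hypothetical helper: pgmap_ref_held models "the caller has already
 * validated this page as ZONE_DEVICE and holds a reference on it".
 * Without that guarantee the helper refuses to touch page->pgmap.
 */
static bool fsdax_page_referenced(const struct page *page, bool pgmap_ref_held)
{
	assert(page != NULL);

	if (!pgmap_ref_held)
		return false;	/* page->pgmap may be stale; do not read it */

	return page->zone_device &&
	       page->pgmap->type == MEMORY_DEVICE_FS_DAX;
}
```

In this model, a caller in the position of memory_failure_dev_pagemap()
would pass true because it already holds the reference, while a caller that
starts from a bare pfn would pass false and get a conservative answer
instead of a racy dereference.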

[1]: http://lore.kernel.org/r/161058499000.1840162.702316708443239771.stgit@dwillia2-desk3.amr.corp.intel.com