Re: [PATCH] mm : fix pte _PAGE_DIRTY bit when fallback migrate page
From: Robbie Ko <hidden>
Date: 2020-07-15 02:05:33
Also in: linux-btrfs, lkml

Vlastimil Babka wrote on 2020/7/14 at 5:46 PM:
> On 7/13/20 3:57 AM, Robbie Ko wrote:
>> Vlastimil Babka wrote on 2020/7/10 at 11:31 PM:
>>> On 7/9/20 4:48 AM, robbieko wrote:
>>>> From: Robbie Ko <redacted>
>>>>
>>>> When a page is migrated, we first create a migration entry to
>>>> replace the original pte, and then go to fallback_migrate_page to
>>>> execute a writeout if migratepage is not supported.
>>>>
>>>> In the writeout, we clear the dirty bit of the page and use
>>>> page_mkclean to clear the dirty bit of the corresponding pte, but
>>>> page_mkclean does not support migration entries. The page dirty bit
>>>> is cleared, but the dirty bit of the pte still exists, so if mmap
>>>> continues to write, it will result in data loss.
>>>
>>> Curious, did you observe this data loss? What filesystem? If yes, it
>>> seems serious enough to CC stable and determine a Fixes: tag?
>>
>> Yes, there is data loss. I'm using a btrfs environment, but not the
>> following patch
>
> And the kernel is otherwise upstream? Which version? Anyway we better
> let btrfs guys know (+CC) even if the fix is in MM code.

Kernel version is 4.4. I think this bug has been around for a long time.
I don't think the problem is limited to btrfs: any filesystem that has
not implemented the migratepage callback will run into it
(e.g. ecryptfs, fat, nfs...).

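To illustrate why any filesystem without the callback is exposed, here is a minimal user-space sketch of the path selection described in this thread. It is a simplified model, not kernel code: `model_aops`, `model_pick_path`, `model_noop`, and the `PATH_*` values are hypothetical names. The clean-page branch is my assumption about what the fallback does when no writeout is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the filesystem's address space operations. */
struct model_aops {
	int (*migratepage)(void);	/* NULL if the fs has no migratepage callback */
	int (*writepage)(void);
};

enum migrate_path {
	PATH_FS_CALLBACK,		/* fs-specific migratepage is used */
	PATH_FALLBACK_WRITEOUT,		/* fallback: dirty page is written out */
	PATH_FALLBACK_CLEAN		/* fallback: page is clean, no writeout (assumed) */
};

/* Placeholder callback so the table can be populated in an example. */
int model_noop(void)
{
	return 0;
}

/* Pick the path a page would take under this simplified model. */
enum migrate_path model_pick_path(const struct model_aops *aops, int page_dirty)
{
	assert(aops != NULL);

	if (aops->migratepage)
		return PATH_FS_CALLBACK;

	/* No callback: only dirty pages reach the writeout being patched. */
	return page_dirty ? PATH_FALLBACK_WRITEOUT : PATH_FALLBACK_CLEAN;
}
```

In this model, only a filesystem with a NULL `migratepage` and a dirty page ends up in the writeout path that the patch changes.
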
>> btrfs: implement migratepage callback for data pages
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.8-rc5&id=f8e6608180a31cc72a23b74969da428da236dbd1
>
> That's a new commit, so if this is really affecting upstream btrfs
> pre-5.8 we should either backport that commit, or your fix (after
> review).

>>>> We fix this by first removing the migration entry and then clearing
>>>> the dirty bits of the page, which also clears the pte's dirty bits.
>>>>
>>>> Signed-off-by: Robbie Ko <redacted>
>>>> ---
>>>>  mm/migrate.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index f37729673558..5c407434b9ba 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -875,10 +875,6 @@ static int writeout(struct address_space *mapping, struct page *page)
>>>>  		/* No write method for the address space */
>>>>  		return -EINVAL;
>>>>  
>>>> -	if (!clear_page_dirty_for_io(page))
>>>> -		/* Someone else already triggered a write */
>>>> -		return -EAGAIN;
>>>> -
>>>>  	/*
>>>>  	 * A dirty page may imply that the underlying filesystem has
>>>>  	 * the page on some queue. So the page must be clean for
>>>> @@ -889,6 +885,10 @@ static int writeout(struct address_space *mapping, struct page *page)
>>>>  	 */
>>>>  	remove_migration_ptes(page, page, false);
>>>>  
>>>> +	if (!clear_page_dirty_for_io(page))
>>>> +		/* Someone else already triggered a write */
>>>> +		return -EAGAIN;
>>>> +
>>>>  	rc = mapping->a_ops->writepage(page, &wbc);
>>>>  
>>>>  	if (rc != AOP_WRITEPAGE_ACTIVATE)
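
The ordering problem the quoted patch addresses can be modelled in user space. The sketch below is a toy model that follows the patch description, not the kernel's data structures: all `model_*` names, `enum order`, and `struct outcome` are hypothetical. It assumes a single mapping per page and that the pte state hidden behind a migration entry is untouched by the page_mkclean step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these are kernel definitions. */
enum pte_kind { PTE_PRESENT, PTE_MIGRATION };

struct model_pte {
	enum pte_kind kind;
	bool dirty;			/* stands in for the pte dirty bit */
};

struct model_page {
	bool dirty;			/* stands in for the page dirty flag */
	struct model_pte *pte;		/* single mapping, for simplicity */
};

/* Models page_mkclean(): migration entries are skipped, as the patch describes. */
static void model_page_mkclean(struct model_page *page)
{
	if (page->pte->kind == PTE_PRESENT)
		page->pte->dirty = false;
}

/* Models clear_page_dirty_for_io(): clean the pte, then clear the page flag. */
static bool model_clear_page_dirty_for_io(struct model_page *page)
{
	model_page_mkclean(page);
	bool was_dirty = page->dirty;
	page->dirty = false;
	return was_dirty;
}

/* Models remove_migration_ptes(): the entry becomes a present pte again. */
static void model_remove_migration_pte(struct model_pte *pte)
{
	assert(pte->kind == PTE_MIGRATION);
	pte->kind = PTE_PRESENT;
}

enum order { CLEAR_THEN_RESTORE, RESTORE_THEN_CLEAR };

struct outcome {
	bool page_dirty;
	bool pte_dirty;
};

/* Run the writeout steps in the given order and report the final state. */
struct outcome model_writeout(enum order order)
{
	struct model_pte pte = { PTE_PRESENT, true };
	struct model_page page = { true, &pte };

	pte.kind = PTE_MIGRATION;	/* migration entry replaces the pte */

	if (order == CLEAR_THEN_RESTORE) {
		/* Order before the patch. */
		model_clear_page_dirty_for_io(&page);
		model_remove_migration_pte(&pte);
	} else {
		/* Order after the patch. */
		model_remove_migration_pte(&pte);
		model_clear_page_dirty_for_io(&page);
	}

	struct outcome out = { page.dirty, pte.dirty };
	return out;
}
```

Under this model, `CLEAR_THEN_RESTORE` ends with a clean page whose pte is still marked dirty, which is the mismatch reported above. `RESTORE_THEN_CLEAR` leaves both clean.
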