Thread (40 messages) 40 messages, 5 authors, 2021-05-31

Re: [PATCH v3 27/31] btrfs: fix a crash caused by race between prepare_pages() and btrfs_releasepage()

From: Filipe Manana <hidden>
Date: 2021-05-24 10:56:30

On Fri, May 21, 2021 at 9:08 PM Qu Wenruo [off-list ref] wrote:
[BUG]
When running generic/095, there is a high chance to crash with subpage
data RW support:
 assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:171
 ------------[ cut here ]------------
 kernel BUG at fs/btrfs/ctree.h:3403!
 Internal error: Oops - BUG: 0 [#1] SMP
 CPU: 1 PID: 3567 Comm: fio Tainted: G         C O      5.12.0-rc7-custom+ #17
 Hardware name: Khadas VIM3 (DT)
 Call trace:
  assertfail.constprop.0+0x28/0x2c [btrfs]
  btrfs_subpage_assert+0x80/0xa0 [btrfs]
  btrfs_subpage_set_uptodate+0x34/0xec [btrfs]
  btrfs_page_clamp_set_uptodate+0x74/0xa4 [btrfs]
  btrfs_dirty_pages+0x160/0x270 [btrfs]
  btrfs_buffered_write+0x444/0x630 [btrfs]
  btrfs_direct_write+0x1cc/0x2d0 [btrfs]
  btrfs_file_write_iter+0xc0/0x160 [btrfs]
  new_sync_write+0xe8/0x180
  vfs_write+0x1b4/0x210
  ksys_pwrite64+0x7c/0xc0
  __arm64_sys_pwrite64+0x24/0x30
  el0_svc_common.constprop.0+0x70/0x140
  do_el0_svc+0x28/0x90
  el0_svc+0x2c/0x54
  el0_sync_handler+0x1a8/0x1ac
  el0_sync+0x170/0x180
 Code: f0000160 913be042 913c4000 955444bc (d4210000)
 ---[ end trace 3fdd39f4cccedd68 ]---

[CAUSE]
Although prepare_pages() calls find_or_create_page(), which returns the
page locked, but in later prepare_uptodate_page() calls, we may call
btrfs_readpage() which unlocked the page.

This leaves a window where btrfs_releasepage() can sneak in and release
the page.

This can be proven by the dying ftrace dump:
 fio-3567 : prepare_pages: r/i=5/257 page_offset=262144 private=1 after set extent map
 fio-3536 : __btrfs_releasepage.part.0: r/i=5/257 page_offset=262144 private=1 clear extent map
 fio-3567 : prepare_uptodate_page.part.0: r/i=5/257 page_offset=262144 private=0 after readpage
 fio-3567 : btrfs_dirty_pages: r/i=5/257 page_offset=262144 private=0  NOT PRIVATE
Pasting here the tracing results form some custom tracepoints you
added for your own debugging does not add that much value IMHO, anyone
reading this will not know the exact places where the tracepoints were
added,
plus the previous explanation is clear enough.
quoted hunk ↗ jump to hunk
[FIX]
In prepare_uptodate_page(), we should not only check page->mapping, but
also PagePrivate() to ensure we are still hold a correct page which has
proper fs context setup.

Reported-by: Ritesh Harjani <redacted>
Tested-by: Ritesh Harjani <redacted>
Signed-off-by: Qu Wenruo <redacted>
---
 fs/btrfs/file.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 6ef44afa939c..a4c092028bb6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1341,7 +1341,17 @@ static int prepare_uptodate_page(struct inode *inode,
                        unlock_page(page);
                        return -EIO;
                }
-               if (page->mapping != inode->i_mapping) {
+
+               /*
+                * Since btrfs_readpage() will get the page unlocked, we have
+                * a window where fadvice() can try to release the page.
+                * Here we check both inode mapping and PagePrivate() to
+                * make sure the page is not released.
+                *
+                * The priavte flag check is essential for subpage as we need
+                * to store extra bitmap using page->private.
+                */
+               if (page->mapping != inode->i_mapping || !PagePrivate(page)) {
My comments from v1 still apply here:

https://lore.kernel.org/linux-btrfs/CAL3q7H5P79kEqWUnN2QKG92N3u7+G0uWbmeC0yT1LypV63MAYA@mail.gmail.com/ (local)

The code looks good.
Thanks.
                        unlock_page(page);
                        return -EAGAIN;
                }
--
2.31.1

-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help