Thread (19 messages) 19 messages, 5 authors, 2021-06-10

Re: [PATCH v2 2/3] btrfs: zoned: fix compressed writes

From: Qu Wenruo <hidden>
Date: 2021-05-24 13:04:51
Subsystem: btrfs file system, filesystems (vfs and infrastructure), the rest · Maintainers: Chris Mason, David Sterba, Alexander Viro, Christian Brauner, Linus Torvalds


On 2021/5/24 上午7:09, Qu Wenruo wrote:

On 2021/5/23 下午10:13, Josef Bacik wrote:
quoted
On 5/18/21 11:40 AM, Johannes Thumshirn wrote:
quoted
When multiple processes write data to the same block group on a
compressed
zoned filesystem, the underlying device could report I/O errors and data
corruption is possible.

This happens because on a zoned file system, compressed data writes
where
sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND
operation. But with REQ_OP_WRITE and parallel submission it cannot be
guaranteed that the data is always submitted aligned to the underlying
zone's write pointer.

The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a
zoned
filesystem is non intrusive on a regular file system or when
submitting to
a conventional zone on a zoned filesystem, as it is guarded by
btrfs_use_zone_append.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat
flag")
Signed-off-by: Johannes Thumshirn <redacted>
This one is causing panics with btrfs/027 with -o compress.  I bisected
it to something else earlier, but it was still happening today and I
bisected again and this is what popped out.  I also went the extra step
to revert the patch as I have already fucked this up once, and the
problem didn't re-occur with this patch reverted.  The panic looks like
this

May 22 00:33:16 xfstests2 kernel: BTRFS critical (device dm-9): mapping
failed logical 22429696 bio len 53248 len 49152
This is the interesting part, it means we are just one sector beyond the
stripe boundary.
Definitely a sign of changed bio submission timing.

Just like the code:

+        if (pg_index == 0 && use_append)
+            len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
+        else
+            len = bio_add_page(bio, page, PAGE_SIZE, 0);
+
          page->mapping = NULL;
-        if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
-            PAGE_SIZE) {
+        if (submit || len < PAGE_SIZE) {

The code has changed the timing of bio_add_page().

Previously, if we have submit == true, we won't even try to call
bio_add_page().

But now, we will add the page even we're already at the stripe boundary,
thus it causes the extra sector being added to bio, and crosses stripe
boundary.

This part is already super tricky, thus I refactored
submit_extent_page() to do a better job at stripe boundary calculation.
BTW, I can also reproduce the problem in btrfs/027 using the latest
misc-next branch.

Thus to workaround the problem, I'm using the following diff, feel free
to fold in to the offending patch.

Thanks,
Qu
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 831e6ae92940..26e2dceda1fc 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -455,10 +455,13 @@ blk_status_t btrfs_submit_compressed_write(struct
btrfs_inode *inode, u64 start,
                         submit = btrfs_bio_fits_in_stripe(page,
PAGE_SIZE, bio,
                                                           0);

-               if (pg_index == 0 && use_append)
-                       len = bio_add_zone_append_page(bio, page,
PAGE_SIZE, 0);
-               else
-                       len = bio_add_page(bio, page, PAGE_SIZE, 0);
+               if (!submit) {
+                       if (pg_index == 0 && use_append)
+                               len = bio_add_zone_append_page(bio, page,
+
PAGE_SIZE, 0);
+                       else
+                               len = bio_add_page(bio, page, PAGE_SIZE, 0);
+               }

                 page->mapping = NULL;
                 if (submit || len < PAGE_SIZE) {
We definitely need to make other bio_add_page() callers to use a better
helper, not only for later subpage, but also to make our lives easier.

Thanks,
Qu
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help