Thread (19 messages), 5 authors, 2021-06-10

Re: [PATCH v2 2/3] btrfs: zoned: fix compressed writes

From: Qu Wenruo <hidden>
Date: 2021-05-23 23:09:46


On 2021/5/23 10:13 PM, Josef Bacik wrote:
On 5/18/21 11:40 AM, Johannes Thumshirn wrote:
When multiple processes write data to the same block group on a compressed
zoned filesystem, the underlying device could report I/O errors and data
corruption is possible.

This happens because on a zoned file system, compressed data writes were
sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND
operation. But with REQ_OP_WRITE and parallel submission it cannot be
guaranteed that the data is always submitted aligned to the underlying
zone's write pointer.

The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a zoned
filesystem is non-intrusive on a regular file system or when submitting to
a conventional zone on a zoned filesystem, as it is guarded by
btrfs_use_zone_append.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag")
Signed-off-by: Johannes Thumshirn <redacted>
This one is causing panics with btrfs/027 with -o compress.  I bisected
it to something else earlier, but it was still happening today and I
bisected again and this is what popped out.  I also went the extra step
to revert the patch as I have already fucked this up once, and the
problem didn't re-occur with this patch reverted.  The panic looks like
this

May 22 00:33:16 xfstests2 kernel: BTRFS critical (device dm-9): mapping failed logical 22429696 bio len 53248 len 49152
This is the interesting part: the bio is 53248 bytes long but only 49152
bytes can be mapped, which means we are just one sector (4096 bytes) beyond
the stripe boundary.
Definitely a sign of changed bio submission timing.
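
The numbers in that log line can be checked by hand. Below is a minimal
userspace sketch, assuming a 64 KiB stripe length (not stated in this mail)
and reading the trailing "len 49152" as the number of bytes that could be
mapped from the logical address. The helper names are made up for
illustration and are not btrfs functions.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_LEN 65536u /* assumption: 64 KiB stripe length */

/* Bytes left between a logical address and the next stripe boundary. */
static uint64_t bytes_to_stripe_end(uint64_t logical)
{
	return STRIPE_LEN - (logical % STRIPE_LEN);
}

/* Bytes by which a bio of bio_len starting at logical crosses the boundary. */
static uint64_t overshoot(uint64_t logical, uint64_t bio_len)
{
	uint64_t left = bytes_to_stripe_end(logical);

	assert(bio_len > 0);
	return bio_len > left ? bio_len - left : 0;
}
```

Under that assumption, logical 22429696 sits 16384 bytes into its stripe, so
49152 bytes remain before the boundary, and a 53248-byte bio overshoots by
4096 bytes, which is one 4 KiB sector.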

Just like the code:

+		if (pg_index == 0 && use_append)
+			len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
+		else
+			len = bio_add_page(bio, page, PAGE_SIZE, 0);
+
  		page->mapping = NULL;
-		if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
-		    PAGE_SIZE) {
+		if (submit || len < PAGE_SIZE) {

The patch has changed when bio_add_page() is called.

Previously, if submit == true, the short-circuit in the condition meant we
would not even call bio_add_page().

But now we add the page unconditionally, even if we're already at the
stripe boundary. That adds an extra sector to the bio, which then crosses
the stripe boundary.
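
The ordering difference can be reproduced with a toy userspace model. This
is only a sketch: struct toy_bio, toy_add_page() and toy_crosses_stripe()
are hypothetical stand-ins, not the kernel's bio API, and the boundary check
is an assumed simplification of what sets "submit" in the real loop. The
model also stops at the first submission and does not model the follow-up
bio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Toy stand-in for a bio: only tracks how many bytes were added. */
struct toy_bio {
	size_t size;
};

/* Toy stand-in for bio_add_page(): always succeeds, returns bytes added. */
static size_t toy_add_page(struct toy_bio *bio)
{
	bio->size += TOY_PAGE_SIZE;
	return TOY_PAGE_SIZE;
}

/* Assumed check: true when one more page would cross the stripe boundary. */
static bool toy_crosses_stripe(const struct toy_bio *bio, size_t stripe_left)
{
	return bio->size != 0 && bio->size + TOY_PAGE_SIZE > stripe_left;
}

/* Old ordering: "submit || add < PAGE_SIZE" skips the add once submit is set. */
static size_t first_bio_size_old(size_t nr_pages, size_t stripe_left)
{
	struct toy_bio bio = { 0 };

	assert(stripe_left % TOY_PAGE_SIZE == 0);
	for (size_t i = 0; i < nr_pages; i++) {
		bool submit = toy_crosses_stripe(&bio, stripe_left);

		if (submit || toy_add_page(&bio) < TOY_PAGE_SIZE)
			break;
	}
	return bio.size;
}

/* New ordering: the page is added first, then submit is checked. */
static size_t first_bio_size_new(size_t nr_pages, size_t stripe_left)
{
	struct toy_bio bio = { 0 };

	assert(stripe_left % TOY_PAGE_SIZE == 0);
	for (size_t i = 0; i < nr_pages; i++) {
		bool submit = toy_crosses_stripe(&bio, stripe_left);
		size_t len = toy_add_page(&bio);

		if (submit || len < TOY_PAGE_SIZE)
			break;
	}
	return bio.size;
}
```

With 49152 bytes left in the stripe and more than 12 pages to write, the old
ordering stops the first bio at 49152 bytes, while the new ordering lets it
grow to 53248 bytes, matching the "bio len 53248 len 49152" pair in the log.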

This part is already super tricky, which is why I refactored
submit_extent_page() to do a better job at stripe boundary calculation.

We definitely need to make the other bio_add_page() callers use a better
helper, not only for the later subpage work, but also to make our lives
easier.

Thanks,
Qu
May 22 00:33:16 xfstests2 kernel: ------------[ cut here ]------------
May 22 00:33:16 xfstests2 kernel: kernel BUG at fs/btrfs/volumes.c:6643!
May 22 00:33:16 xfstests2 kernel: invalid opcode: 0000 [#1] SMP NOPTI
May 22 00:33:16 xfstests2 kernel: CPU: 1 PID: 2236088 Comm: kworker/u4:4 Not tainted 5.13.0-rc2+ #240
May 22 00:33:16 xfstests2 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
May 22 00:33:16 xfstests2 kernel: Workqueue: btrfs-delalloc btrfs_work_helper
May 22 00:33:16 xfstests2 kernel: RIP: 0010:btrfs_map_bio.cold+0x58/0x5a
May 22 00:33:16 xfstests2 kernel: Code: 50 e8 6b 83 ff ff e8 5b 0d 88 ff 48 83 c4 18 e9 94 8f 88 ff 48 8b 3c 24 4c 89 f1 4c 89 fa 48 c7 c6 f8 db 62 96 e8 47 83 ff ff <0f> 0b 4c 89 e7 e8 52 1f 83 ff e9 03 98 88 ff 49 8b 7a 50 44 89 f2
May 22 00:33:16 xfstests2 kernel: RSP: 0018:ffffb310c1de7c88 EFLAGS: 00010282
May 22 00:33:16 xfstests2 kernel: RAX: 0000000000000055 RBX: 0000000000000000 RCX: 0000000000000000
May 22 00:33:16 xfstests2 kernel: RDX: ffff9b9a7bd27540 RSI: ffff9b9a7bd18e10 RDI: ffff9b9a7bd18e10
May 22 00:33:16 xfstests2 kernel: RBP: ffff9b9a482ad7f8 R08: 0000000000000000 R09: 0000000000000000
May 22 00:33:16 xfstests2 kernel: R10: ffffb310c1de7a48 R11: ffffffff96973748 R12: 0000000000000000
May 22 00:33:16 xfstests2 kernel: R13: ffff9b9a001e7300 R14: 000000000000d000 R15: 0000000001564000
May 22 00:33:16 xfstests2 kernel: FS:  0000000000000000(0000) GS:ffff9b9a7bd00000(0000) knlGS:0000000000000000
May 22 00:33:16 xfstests2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 22 00:33:16 xfstests2 kernel: CR2: 00005621fe4566e0 CR3: 000000013943a005 CR4: 0000000000370ee0
May 22 00:33:16 xfstests2 kernel: Call Trace:
May 22 00:33:16 xfstests2 kernel:  btrfs_submit_compressed_write+0x2d7/0x470
May 22 00:33:16 xfstests2 kernel:  submit_compressed_extents+0x364/0x420
May 22 00:33:16 xfstests2 kernel:  ? lock_acquire+0x15d/0x380
May 22 00:33:16 xfstests2 kernel:  ? lock_release+0x1cd/0x2a0
May 22 00:33:16 xfstests2 kernel:  ? submit_compressed_extents+0x420/0x420
May 22 00:33:16 xfstests2 kernel:  btrfs_work_helper+0x133/0x520
May 22 00:33:16 xfstests2 kernel:  process_one_work+0x26b/0x570
May 22 00:33:16 xfstests2 kernel:  worker_thread+0x55/0x3c0
May 22 00:33:16 xfstests2 kernel:  ? process_one_work+0x570/0x570
May 22 00:33:16 xfstests2 kernel:  kthread+0x134/0x150
May 22 00:33:16 xfstests2 kernel:  ? __kthread_bind_mask+0x60/0x60
May 22 00:33:16 xfstests2 kernel:  ret_from_fork+0x1f/0x30

Thanks,

Josef