Re: [PATCH v2 2/3] btrfs: zoned: fix compressed writes
From: Qu Wenruo <hidden>
Date: 2021-05-23 23:09:46
On 2021/5/23 10:13 PM, Josef Bacik wrote:
On 5/18/21 11:40 AM, Johannes Thumshirn wrote:
When multiple processes write data to the same block group on a
compressed zoned filesystem, the underlying device could report I/O
errors and data corruption is possible.

This happens because on a zoned file system, compressed data writes
where sent to the device via a REQ_OP_WRITE instead of a
REQ_OP_ZONE_APPEND operation. But with REQ_OP_WRITE and parallel
submission it cannot be guaranteed that the data is always submitted
aligned to the underlying zone's write pointer.

The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a
zoned filesystem is non intrusive on a regular file system or when
submitting to a conventional zone on a zoned filesystem, as it is
guarded by btrfs_use_zone_append.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag")
Signed-off-by: Johannes Thumshirn <redacted>

This one is causing panics with btrfs/027 with -o compress. I bisected
it to something else earlier, but it was still happening today and I
bisected again and this is what popped out. I also went the extra step
to revert the patch as I have already fucked this up once, and the
problem didn't re-occur with this patch reverted. The panic looks like
this

May 22 00:33:16 xfstests2 kernel: BTRFS critical (device dm-9): mapping failed logical 22429696 bio len 53248 len 49152
This is the interesting part: it means the bio is just one sector beyond
the stripe boundary.
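
To spell out the arithmetic behind "one sector", here is a minimal sketch.
The two lengths are taken from the "mapping failed" message; the 4096-byte
sector size is an assumption (the usual default on x86_64), and the helper
name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the log line: "bio len 53248 len 49152". */
#define LOG_BIO_LEN 53248u /* bytes in the submitted bio */
#define LOG_MAP_LEN 49152u /* bytes left before the stripe boundary */
#define ASSUMED_SECTORSIZE 4096u /* assumption, not shown in the log */

/*
 * Hypothetical helper: how many whole sectors the bio extends past the
 * mapped length. Returns 0 when the bio fits.
 */
static uint32_t overshoot_sectors(uint32_t bio_len, uint32_t map_len,
				  uint32_t sectorsize)
{
	assert(sectorsize != 0);
	if (bio_len <= map_len)
		return 0;
	return (bio_len - map_len) / sectorsize;
}
```

With the values from the log, 53248 - 49152 = 4096 bytes, so
overshoot_sectors(LOG_BIO_LEN, LOG_MAP_LEN, ASSUMED_SECTORSIZE) is 1.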
Definitely a sign of changed bio submission timing.
This hunk of the patch shows it:
+		if (pg_index == 0 && use_append)
+			len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
+		else
+			len = bio_add_page(bio, page, PAGE_SIZE, 0);
+
 		page->mapping = NULL;
-		if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
-		    PAGE_SIZE) {
+		if (submit || len < PAGE_SIZE) {
The patch changes when bio_add_page() gets called.
Previously, if submit was true, we wouldn't even try to call
bio_add_page().
But now we add the page even if we're already at the stripe boundary,
so an extra sector gets added to the bio and it crosses the stripe
boundary.
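
The ordering difference can be reproduced with a small userspace toy
model. This is only a sketch, not the kernel code: struct toy_bio,
toy_must_submit() and toy_largest_bio() are made-up names, a "bio" is
reduced to a byte counter, bio_add_page() failures are ignored, and every
fresh bio is checked against the same stripe_room for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Toy stand-in for a bio: only tracks how many bytes have been added. */
struct toy_bio {
	uint64_t len;
};

/* True when a non-empty bio has no room for one more page in the stripe. */
static bool toy_must_submit(const struct toy_bio *bio, uint64_t stripe_room)
{
	return bio->len != 0 && bio->len + TOY_PAGE_SIZE > stripe_room;
}

/*
 * Walk nr_pages pages and return the largest bio that would be submitted.
 * add_before_check == false models the old ordering, where the
 * short-circuit in `submit || bio_add_page(...)` skips the add when
 * submit is set. add_before_check == true models the patched ordering,
 * where the page is added first and submit is looked at afterwards.
 */
static uint64_t toy_largest_bio(unsigned int nr_pages, uint64_t stripe_room,
				bool add_before_check)
{
	struct toy_bio bio = { 0 };
	uint64_t largest = 0;

	assert(stripe_room >= TOY_PAGE_SIZE);

	for (unsigned int i = 0; i < nr_pages; i++) {
		bool submit = toy_must_submit(&bio, stripe_room);

		if (add_before_check)
			bio.len += TOY_PAGE_SIZE; /* lands in the full bio */

		if (submit) {
			if (bio.len > largest)
				largest = bio.len;
			bio.len = 0;              /* start a fresh bio */
			bio.len += TOY_PAGE_SIZE; /* page goes into the new bio */
		} else if (!add_before_check) {
			bio.len += TOY_PAGE_SIZE;
		}
	}
	if (bio.len > largest)
		largest = bio.len;
	return largest;
}
```

For 13 pages and 49152 bytes of room, toy_largest_bio(13, 49152, false)
gives 49152, while toy_largest_bio(13, 49152, true) gives 53248, the same
pair of lengths as in the "mapping failed" message.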
This part is already super tricky, which is why I refactored
submit_extent_page() to do a better job of stripe boundary calculation.
We definitely need to make the other bio_add_page() callers use a better
helper, not only for the upcoming subpage support, but also to make our
lives easier.
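
As a rough idea of what such a helper could look like, here is a
hypothetical userspace sketch. It is not the refactored
submit_extent_page() and not an existing kernel API: struct
toy_bounded_bio and toy_bio_add_bounded() are invented names, and the
only point is that the helper itself clamps the add at the stripe
boundary, so callers cannot cross it by getting the ordering wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bio: logical start address plus the bytes added so far. */
struct toy_bounded_bio {
	uint64_t start;
	uint64_t len;
};

/*
 * Hypothetical helper: add up to `size` bytes, but never past
 * `stripe_end`. Returns the number of bytes actually added, so a return
 * value smaller than `size` tells the caller to submit and start a new
 * bio.
 */
static uint32_t toy_bio_add_bounded(struct toy_bounded_bio *bio,
				    uint32_t size, uint64_t stripe_end)
{
	uint64_t cur_end;
	uint64_t room;
	uint32_t added;

	assert(bio != NULL);

	cur_end = bio->start + bio->len;
	room = stripe_end > cur_end ? stripe_end - cur_end : 0;
	added = size < room ? size : (uint32_t)room;

	bio->len += added;
	return added;
}
```

Using the numbers from the log (logical 22429696, 49152 bytes to the
boundary), a bio holding 45056 bytes accepts one more 4096-byte page and
then refuses the next one by returning 0.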
Thanks,
Qu

May 22 00:33:16 xfstests2 kernel: ------------[ cut here ]------------
May 22 00:33:16 xfstests2 kernel: kernel BUG at fs/btrfs/volumes.c:6643!
May 22 00:33:16 xfstests2 kernel: invalid opcode: 0000 [#1] SMP NOPTI
May 22 00:33:16 xfstests2 kernel: CPU: 1 PID: 2236088 Comm: kworker/u4:4 Not tainted 5.13.0-rc2+ #240
May 22 00:33:16 xfstests2 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
May 22 00:33:16 xfstests2 kernel: Workqueue: btrfs-delalloc btrfs_work_helper
May 22 00:33:16 xfstests2 kernel: RIP: 0010:btrfs_map_bio.cold+0x58/0x5a
May 22 00:33:16 xfstests2 kernel: Code: 50 e8 6b 83 ff ff e8 5b 0d 88 ff 48 83 c4 18 e9 94 8f 88 ff 48 8b 3c 24 4c 89 f1 4c 89 fa 48 c7 c6 f8 db 62 96 e8 47 83 ff ff <0f> 0b 4c 89 e7 e8 52 1f 83 ff e9 03 98 88 ff 49 8b 7a 50 44 89 f2
May 22 00:33:16 xfstests2 kernel: RSP: 0018:ffffb310c1de7c88 EFLAGS: 00010282
May 22 00:33:16 xfstests2 kernel: RAX: 0000000000000055 RBX: 0000000000000000 RCX: 0000000000000000
May 22 00:33:16 xfstests2 kernel: RDX: ffff9b9a7bd27540 RSI: ffff9b9a7bd18e10 RDI: ffff9b9a7bd18e10
May 22 00:33:16 xfstests2 kernel: RBP: ffff9b9a482ad7f8 R08: 0000000000000000 R09: 0000000000000000
May 22 00:33:16 xfstests2 kernel: R10: ffffb310c1de7a48 R11: ffffffff96973748 R12: 0000000000000000
May 22 00:33:16 xfstests2 kernel: R13: ffff9b9a001e7300 R14: 000000000000d000 R15: 0000000001564000
May 22 00:33:16 xfstests2 kernel: FS: 0000000000000000(0000) GS:ffff9b9a7bd00000(0000) knlGS:0000000000000000
May 22 00:33:16 xfstests2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 22 00:33:16 xfstests2 kernel: CR2: 00005621fe4566e0 CR3: 000000013943a005 CR4: 0000000000370ee0
May 22 00:33:16 xfstests2 kernel: Call Trace:
May 22 00:33:16 xfstests2 kernel: btrfs_submit_compressed_write+0x2d7/0x470
May 22 00:33:16 xfstests2 kernel: submit_compressed_extents+0x364/0x420
May 22 00:33:16 xfstests2 kernel: ? lock_acquire+0x15d/0x380
May 22 00:33:16 xfstests2 kernel: ? lock_release+0x1cd/0x2a0
May 22 00:33:16 xfstests2 kernel: ? submit_compressed_extents+0x420/0x420
May 22 00:33:16 xfstests2 kernel: btrfs_work_helper+0x133/0x520
May 22 00:33:16 xfstests2 kernel: process_one_work+0x26b/0x570
May 22 00:33:16 xfstests2 kernel: worker_thread+0x55/0x3c0
May 22 00:33:16 xfstests2 kernel: ? process_one_work+0x570/0x570
May 22 00:33:16 xfstests2 kernel: kthread+0x134/0x150
May 22 00:33:16 xfstests2 kernel: ? __kthread_bind_mask+0x60/0x60
May 22 00:33:16 xfstests2 kernel: ret_from_fork+0x1f/0x30

Thanks,
Josef