Thread: 10 messages, 3 authors, 2021-06-01

Re: [PATCH] btrfs: zoned: limit ordered extent to zoned append size

From: David Sterba <hidden>
Date: 2021-05-21 16:39:42

On Fri, May 21, 2021 at 06:11:04PM +0900, Johannes Thumshirn wrote:
Damien reported a test failure with btrfs/209. The test itself ran fine,
but the fsck run afterwards reported a corrupted filesystem.

The filesystem corruption happens because we're splitting an extent and
then writing the extent twice. We have to split the extent though, because
we're creating too large extents for a REQ_OP_ZONE_APPEND operation.

When dumping the extent tree, we can see two EXTENT_ITEMs at the same
start address but different lengths.

$ btrfs inspect dump-tree /dev/nullb1 -t extent
...
   item 19 key (269484032 EXTENT_ITEM 126976) itemoff 15470 itemsize 53
           refs 1 gen 7 flags DATA
           extent data backref root FS_TREE objectid 257 offset 786432 count 1
   item 20 key (269484032 EXTENT_ITEM 262144) itemoff 15417 itemsize 53
           refs 1 gen 7 flags DATA
           extent data backref root FS_TREE objectid 257 offset 786432 count 1

On a zoned filesystem, limit the size of an ordered extent to the maximum
size that can be issued as a single REQ_OP_ZONE_APPEND operation.

Note: This patch breaks fstests btrfs/079, as it increases the number of
on-disk extents from 80 to 83 per 10M write.

Reported-by: Damien Le Moal <redacted>
Signed-off-by: Johannes Thumshirn <redacted>
---
 fs/btrfs/extent_io.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 78d3f2ec90e0..e823b2c74af5 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1860,6 +1860,7 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
 				    u64 *end)
 {
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	u64 max_bytes = BTRFS_MAX_EXTENT_SIZE;
 	u64 delalloc_start;
 	u64 delalloc_end;
@@ -1868,6 +1869,9 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
 	int ret;
 	int loops = 0;
 
+	if (fs_info && fs_info->max_zone_append_size)
+		max_bytes = ALIGN_DOWN(fs_info->max_zone_append_size,
+				       PAGE_SIZE);

Why is the alignment needed? Are the max zone append values expected to
be so random? Also, it's using a memory-related value (PAGE_SIZE) for
something that's more hw-related, or at least related to the extent size
(which ends up on disk).

 again:
 	/* step one, find a bunch of delalloc bytes starting at start */
 	delalloc_start = *start;
-- 
2.31.1
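
To make the alignment question concrete, here is a minimal userspace sketch of the arithmetic in the hunk above. It is not the kernel code: `align_down()`, `delalloc_max_bytes()`, and the `SKETCH_*` constants are hypothetical stand-ins, a 4 KiB page size is assumed, and the 130560-byte limit used in the note below is an illustrative value, not one reported in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for kernel definitions; a 4 KiB page is an assumption. */
#define SKETCH_PAGE_SIZE       4096ULL
#define SKETCH_MAX_EXTENT_SIZE (128ULL * 1024 * 1024) /* 128 MiB default cap */

/* Round x down to a multiple of a; a must be a non-zero power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Mirror the logic of the patch: start from the default cap and, when a
 * zone append limit is set (non-zero), replace it with that limit rounded
 * down to a whole number of pages.
 */
static uint64_t delalloc_max_bytes(uint64_t max_zone_append_size)
{
	uint64_t max_bytes = SKETCH_MAX_EXTENT_SIZE;

	if (max_zone_append_size)
		max_bytes = align_down(max_zone_append_size, SKETCH_PAGE_SIZE);

	return max_bytes;
}
```

With these assumptions, a limit that is already page-aligned (for example 131072) passes through unchanged, so the rounding only matters for limits that are not a multiple of the page size: 130560 would be cut to 126976. In this sketch a non-zero limit smaller than one page rounds down to 0; whether such a value can occur in practice is not addressed in the thread.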