[PATCH v2 2/2] btrfs: update comments for chunk allocation -ENOSPC cases

From: fdmanana@kernel.org
Date: 2021-10-08 15:10:45
Subsystem: btrfs file system, filesystems (vfs and infrastructure), the rest · Maintainers: Chris Mason, David Sterba, Alexander Viro, Christian Brauner, Linus Torvalds

From: Filipe Manana <redacted>

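Update the comments at btrfs_chunk_alloc() and do_chunk_alloc() that
describe which cases can lead to a failure to allocate metadata and system
space despite having previously reserved space. This adds one more case
that I previously forgot to mention: a discard operation can temporarily
remove the last free space entry from a block group right before the last
needed extent buffer is allocated.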

Signed-off-by: Filipe Manana <redacted>
---
 fs/btrfs/block-group.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 8ed36d57da31..282046ef1a81 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c

@@ -3409,7 +3409,7 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags)
 	/*
 	 * Normally we are not expected to fail with -ENOSPC here, since we have
 	 * previously reserved space in the system space_info and allocated one
-	 * new system chunk if necessary. However there are two exceptions:
+	 * new system chunk if necessary. However there are three exceptions:
 	 *
 	 * 1) We may have enough free space in the system space_info but all the
 	 *    existing system block groups have a profile which can not be used

@@ -3435,7 +3435,14 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags)
 	 *    with enough free space got turned into RO mode by a running scrub,
 	 *    and in this case we have to allocate a new one and retry. We only
 	 *    need do this allocate and retry once, since we have a transaction
-	 *    handle and scrub uses the commit root to search for block groups.
+	 *    handle and scrub uses the commit root to search for block groups;
+	 *
+	 * 3) We had one system block group with enough free space when we called
+	 *    check_system_chunk(), but after that, right before we tried to
+	 *    allocate the last extent buffer we needed, a discard operation came
+	 *    in and it temporarily removed the last free space entry from the
+	 *    block group (discard removes a free space entry, discards it, and
+	 *    then adds back the entry to the block group cache).
 	 */
 	if (ret == -ENOSPC) {
 		const u64 sys_flags = btrfs_system_alloc_profile(trans->fs_info);

@@ -3519,7 +3526,15 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags)
  *    properly, either intentionally or as a bug. One example where this is
  *    done intentionally is fsync, as it does not reserve any transaction units
  *    and ends up allocating a variable number of metadata extents for log
- *    tree extent buffers.
+ *    tree extent buffers;
+ *
+ * 4) The task has reserved enough transaction units / metadata space, but right
+ *    before it tries to allocate the last extent buffer it needs, a discard
+ *    operation comes in and, temporarily, removes the last free space entry from
+ *    the only metadata block group that had free space (discard starts by
+ *    removing a free space entry from a block group, then does the discard
+ *    operation and, once it's done, it adds back the free space entry to the
+ *    block group).
  *
  * We also need this 2 phases setup when adding a device to a filesystem with
  * a seed device - we must create new metadata and system chunks without adding

-- 
2.33.0
