Re: [RFC PATCH] btrfs: zoned: auto reclaim low mostly full block-groups first
From: David Sterba <hidden>
Date: 2021-09-20 15:50:23
On Tue, Sep 21, 2021 at 12:11:01AM +0900, Johannes Thumshirn wrote:
Currently auto reclaim of unusable zones reclaims the block-groups in the order they have been added to the reclaim list. Sort the list so we have the block-groups with the least amount of bytes to preserve at the beginning before starting the garbage collection loop.
Makes sense as an optimization.
quoted hunk ↗ jump to hunk
Signed-off-by: Johannes Thumshirn <redacted> --- fs/btrfs/block-group.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 46fdef7bbe20..d90297fb99e1 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c@@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 +#include <linux/list_sort.h> + #include "misc.h" #include "ctree.h" #include "block-group.h"@@ -1486,6 +1488,21 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg) spin_unlock(&fs_info->unused_bgs_lock); } +/* + * We want block groups with a low number of used bytes to be in the beginning + * of the list, so they will get reclaimed first. + */ +static int reclaim_bgs_cmp(void *unused, const struct list_head *a, + const struct list_head *b) +{ + const struct btrfs_block_group *bg1, *bg2; + + bg1 = list_entry(a, struct btrfs_block_group, bg_list); + bg2 = list_entry(b, struct btrfs_block_group, bg_list); + + return bg1->used - bg2->used; +} + void btrfs_reclaim_bgs_work(struct work_struct *work) { struct btrfs_fs_info *fs_info =@@ -1510,6 +1527,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work) } spin_lock(&fs_info->unused_bgs_lock); + list_sort(NULL, &fs_info->reclaim_bgs, reclaim_bgs_cmp);
The sort is under a spinlock, though it's probably not a highly
contended lock, I think we should try to move it outside. Something like
lock()
list_splice_init(&splice, &reclaim_bgs)
unlock()
list_sort(&splice);
while (!list_empty(splice)) {
}
We already use splice in the again_list so it could build on top of it.
OTOH, it may not be absolutelly necessary to do the sort outside of the
lock but rather because as a matter of good programming hygiene to not
introduce unnecessary delays due to contended lock here and there that
could potentially cascade further.