[PATCH 00/17] ZNS Support for Btrfs
From: Naohiro Aota <naohiro.aota@wdc.com>
Date: 2021-08-11 14:20:52
This series extends zoned support for Zoned Namespace (ZNS) SSDs [1].

[1] https://zonedstorage.io/introduction/zns/

This series is available on GitHub at

v1   https://github.com/naota/linux/tree/btrfs-zns-v1
HEAD https://github.com/naota/linux/tree/btrfs-zns

The ZNS specification introduces the extra functionalities listed below.

- No conventional zones
- Zone Append write command
- Zone Capacity
- Active Zones

The first two functionalities are already addressed in the current zoned
support on btrfs. We do not rely on conventional zones, and we use the zone
append write command to write data IOs.

This series implements support for the other two. The userland tools need
some tweaks (e.g. using the capacity instead of the length) to be precise,
but they still work fine as they are.

* Zone Capacity Support

A zone capacity is an additional per-zone attribute that indicates the
number of usable logical blocks within each zone, starting from the first
logical block of each zone. It is always smaller than or equal to the zone
size.

We can naturally map the capacity to the newly introduced "zone_capacity"
of a block group. Allocations are limited by the zone capacity instead of
the block group's length.

* Active Zones Tracking

The ZNS specification defines a limit on the number of zones that can be in
the implicit open, explicit open or closed conditions. Any zone in such a
condition is defined as an active zone and corresponds to a zone that is
being written or that has been only partially written.

If the maximum number of active zones is reached, we must either reset or
finish some active zones before being able to choose other zones for
storing data.

In order not to exceed the maximum number of active zones, we need to track
which zones are active and how the active zones are related to the block
groups. We mark a block group as "active" if the corresponding device zones
are all active.
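
To make the active zone definition concrete, here is a minimal userspace
sketch of the bookkeeping described so far. It is not the code from this
series: every toy_* name, the enum values, and the convention that a limit
of 0 means "no limit" are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone conditions, named after the states mentioned above. */
enum toy_zone_cond {
	TOY_ZONE_EMPTY,
	TOY_ZONE_IMP_OPEN,
	TOY_ZONE_EXP_OPEN,
	TOY_ZONE_CLOSED,
	TOY_ZONE_FULL,
};

/* A zone is active when it is implicitly open, explicitly open or closed. */
static bool toy_zone_is_active(enum toy_zone_cond cond)
{
	return cond == TOY_ZONE_IMP_OPEN || cond == TOY_ZONE_EXP_OPEN ||
	       cond == TOY_ZONE_CLOSED;
}

/* Hypothetical per-device accounting of active zones. */
struct toy_device {
	unsigned int max_active_zones;	/* assumption: 0 means no limit */
	unsigned int nr_active_zones;
};

/* Take one active zone slot; fail if the device limit is already reached. */
static bool toy_try_activate(struct toy_device *dev)
{
	if (dev->max_active_zones &&
	    dev->nr_active_zones >= dev->max_active_zones)
		return false;
	dev->nr_active_zones++;
	return true;
}

/* Give one slot back, e.g. after a zone has been finished or reset. */
static void toy_deactivate(struct toy_device *dev)
{
	assert(dev->nr_active_zones > 0);
	dev->nr_active_zones--;
}

/* A block group counts as active only if all of its zones are active. */
static bool toy_block_group_is_active(const enum toy_zone_cond *zones,
				      size_t nr_zones)
{
	for (size_t i = 0; i < nr_zones; i++) {
		if (!toy_zone_is_active(zones[i]))
			return false;
	}
	return nr_zones > 0;
}
```

With a limit of two, a third toy_try_activate() call fails until
toy_deactivate() returns a slot, which mirrors the "reset or finish some
active zones first" rule above.
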
Allocating an extent will activate a block group, and allocation from an
inactive block group is prohibited. Such active block groups are tracked in
a list. Once a block group is fully written, we deactivate it and remove it
from the list.

* Active Zone Aware Sequential Allocator

Handling the active zones makes the allocator complex. Here is a summary of
how find_free_extent_update_loop() behaves.

1. If enough space is available in an active block group
   - Allocate from it (end, success)
2. If we can activate another zone on a device
   2.1 Try to allocate a new block group and activate it
   2.2 If the activation succeeds
       - Allocation will be satisfied from it in the next iteration
   2.3 If the activation fails
       - Try the next cycle. Some writes may free up an active block group
3. If we cannot activate any zones
   3.1 Try to allocate in a smaller size by checking min_alloc_size
       - btrfs_reserve_extent() will halve the allocation size and restart
         the loop
   3.2 Nothing can be done anymore. Give up. ENOSPC

* Patch series organization

Note: patches 2 and 14 are preparation patches and can be merged
independently.

Patches 1-6 implement zone capacity support.

Patch 7 implements finishing a superblock zone once there is no space left
for a new superblock.

Patches 8-13 implement the activation side of the active zone tracking.

Patches 14 and 15 tweak the allocator to retry with a smaller size if
possible (step 3.1 in the above list).

Patches 16 and 17 implement the deactivation side of the active zone
tracking.
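
For readers who prefer code to prose, the allocator summary above can be
sketched as a pure decision function. This is a simplified userspace model,
not the find_free_extent_update_loop() implementation: the toy_* names and
the struct fields are hypothetical, and the shrink helper only models the
"halve, but not below min_alloc_size" part of step 3.1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What the caller should do next, one value per step of the summary. */
enum toy_action {
	TOY_ALLOCATE,		/* step 1: allocate from an active block group */
	TOY_ACTIVATE_NEW_BG,	/* step 2: allocate a new block group and activate it */
	TOY_RETRY_SMALLER,	/* step 3.1: retry with a smaller size */
	TOY_ENOSPC,		/* step 3.2: give up */
};

/* Hypothetical snapshot of the allocator state for one request. */
struct toy_alloc_state {
	uint64_t active_free;	/* largest free space in any active block group */
	bool can_activate;	/* another zone can still be activated */
	uint64_t num_bytes;	/* requested allocation size */
	uint64_t min_alloc_size;	/* smallest size the caller accepts */
};

static enum toy_action toy_next_action(const struct toy_alloc_state *s)
{
	assert(s->num_bytes > 0);

	if (s->active_free >= s->num_bytes)
		return TOY_ALLOCATE;
	/* If the activation later fails, the caller simply calls again (2.3). */
	if (s->can_activate)
		return TOY_ACTIVATE_NEW_BG;
	if (s->num_bytes > s->min_alloc_size)
		return TOY_RETRY_SMALLER;
	return TOY_ENOSPC;
}

/* Halve the request, but never go below min_alloc_size. */
static uint64_t toy_shrink(uint64_t num_bytes, uint64_t min_alloc_size)
{
	uint64_t half = num_bytes / 2;

	return half > min_alloc_size ? half : min_alloc_size;
}
```

A caller would loop on toy_next_action(), applying toy_shrink() whenever it
returns TOY_RETRY_SMALLER, until it sees TOY_ALLOCATE or TOY_ENOSPC.
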
Naohiro Aota (17):
  btrfs: zoned: load zone capacity information from devices
  btrfs: zoned: move btrfs_free_excluded_extents out from btrfs_calc_zone_unusable
  btrfs: zoned: calculate free space from zone capacity
  btrfs: zoned: tweak reclaim threshold for zone capacity
  btrfs: zoned: consider zone as full when no more SB can be written
  btrfs: zoned: locate superblock position using zone capacity
  btrfs: zoned: finish superblock zone once no space left for new SB
  btrfs: zoned: load active zone information from devices
  btrfs: zoned: introduce physical_map to btrfs_block_group
  btrfs: zoned: implement active zone tracking
  btrfs: zoned: load active zone info for block group
  btrfs: zoned: activate block group on allocation
  btrfs: zoned: activate new block group
  btrfs: move ffe_ctl one level up
  btrfs: zoned: avoid chunk allocation if active block group has enough space
  btrfs: zoned: finish fully written block group
  btrfs: zoned: finish relocating block group

 fs/btrfs/block-group.c      |  29 ++-
 fs/btrfs/block-group.h      |   4 +
 fs/btrfs/ctree.h            |   3 +
 fs/btrfs/disk-io.c          |   6 +-
 fs/btrfs/extent-tree.c      | 204 +++++++++------
 fs/btrfs/extent_io.c        |  11 +-
 fs/btrfs/extent_io.h        |   1 +
 fs/btrfs/free-space-cache.c |  19 +-
 fs/btrfs/inode.c            |   6 +-
 fs/btrfs/relocation.c       |   4 +
 fs/btrfs/zoned.c            | 495 +++++++++++++++++++++++++++++++++---
 fs/btrfs/zoned.h            |  39 ++-
 12 files changed, 692 insertions(+), 129 deletions(-)

--
2.32.0