Re: [PATCH 4/4] xfs: use separate btree cursor slab for each btree type
From: Dave Chinner <david@fromorbit.com>
Date: 2021-09-27 22:02:01
On Mon, Sep 27, 2021 at 11:21:22AM -0700, Darrick J. Wong wrote:
On Sun, Sep 26, 2021 at 10:47:21AM +1000, Dave Chinner wrote:quoted
On Thu, Sep 23, 2021 at 06:27:59PM -0700, Darrick J. Wong wrote:quoted
From: Darrick J. Wong <djwong@kernel.org> Now that we have the infrastructure to track the max possible height of each btree type, we can create a separate slab zone for cursors of each type of btree. For smaller indices like the free space btrees, this means that we can pack more cursors into a slab page, improving slab utilization. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- fs/xfs/libxfs/xfs_btree.c | 12 ++++++------ fs/xfs/libxfs/xfs_btree.h | 9 +-------- fs/xfs/xfs_super.c | 33 ++++++++++++++++++++++++--------- 3 files changed, 31 insertions(+), 23 deletions(-)diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 120280c998f8..3131de9ae631 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c@@ -26,7 +26,6 @@ /* * Cursor allocation zone. */ -kmem_zone_t *xfs_btree_cur_zone; struct xfs_btree_cur_zone xfs_btree_cur_zones[XFS_BTNUM_MAX] = { [XFS_BTNUM_BNO] = { .name = "xfs_alloc_btree_cur" }, [XFS_BTNUM_INO] = { .name = "xfs_ialloc_btree_cur" },@@ -364,6 +363,7 @@ xfs_btree_del_cursor( struct xfs_btree_cur *cur, /* btree cursor */ int error) /* del because of error */ { + struct xfs_btree_cur_zone *bczone = &xfs_btree_cur_zones[cur->bc_btnum]; int i; /* btree level */ /*@@ -386,10 +386,10 @@ xfs_btree_del_cursor( kmem_free(cur->bc_ops); if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag) xfs_perag_put(cur->bc_ag.pag); - if (cur->bc_maxlevels > XFS_BTREE_CUR_ZONE_MAXLEVELS) + if (cur->bc_maxlevels > bczone->maxlevels) kmem_free(cur); else - kmem_cache_free(xfs_btree_cur_zone, cur); + kmem_cache_free(bczone->zone, cur); } /*@@ -5021,12 +5021,12 @@ xfs_btree_alloc_cursor( { struct xfs_btree_cur *cur; unsigned int maxlevels = xfs_btree_maxlevels(mp, btnum); + struct xfs_btree_cur_zone *bczone = &xfs_btree_cur_zones[btnum]; - if (maxlevels > XFS_BTREE_CUR_ZONE_MAXLEVELS) + if (maxlevels > bczone->maxlevels) cur = kmem_zalloc(xfs_btree_cur_sizeof(maxlevels), KM_NOFS); else - cur = kmem_cache_zalloc(xfs_btree_cur_zone, - GFP_NOFS | __GFP_NOFAIL); + cur = kmem_cache_zalloc(bczone->zone, GFP_NOFS | __GFP_NOFAIL);When will maxlevels ever be greater than bczone->maxlevels? Isn't the bczone->maxlevels case always supposed to be the tallest possible height for that btree?It should never happen, provided that the maxlevels computation and verification are all correct. I thought it was important to leave the heap allocation in here as a fallback, since the consequence for getting the size calculations wrong is corrupt kernel memory.
I think that this is the wrong approach. Static debug-only testing of btree size calculations at init time is needed here, not runtime fallbacks that hide the fact that we got fundamental calculations wrong. A mistake here should be loud and obvious, not hidden away in a fallback path that might never, ever be hit in the real world. Cheers, Dave. -- Dave Chinner david@fromorbit.com