Re: [PATCH 05/15] xfs: support dynamic btree cursor heights
From: Dave Chinner <david@fromorbit.com>
Date: 2021-10-13 05:31:28
On Tue, Oct 12, 2021 at 04:33:01PM -0700, Darrick J. Wong wrote:
From: Darrick J. Wong <djwong@kernel.org> Split out the btree level information into a separate struct and put it at the end of the cursor structure as a VLA. The realtime rmap btree (which is rooted in an inode) will require the ability to support many more levels than a per-AG btree cursor, which means that we're going to create two btree cursor caches to conserve memory for the more common case. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <redacted> Reviewed-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/libxfs/xfs_alloc.c | 6 +- fs/xfs/libxfs/xfs_bmap.c | 10 +-- fs/xfs/libxfs/xfs_btree.c | 168 +++++++++++++++++++++++---------------------- fs/xfs/libxfs/xfs_btree.h | 28 ++++++-- fs/xfs/scrub/bitmap.c | 22 +++--- fs/xfs/scrub/bmap.c | 2 - fs/xfs/scrub/btree.c | 47 +++++++------ fs/xfs/scrub/trace.c | 7 +- fs/xfs/scrub/trace.h | 10 +-- fs/xfs/xfs_super.c | 2 - fs/xfs/xfs_trace.h | 2 - 11 files changed, 164 insertions(+), 140 deletions(-)
Hmmm - subject of the patch doesn't really match the changes being made - there's nothing here that makes the btree cursor heights dynamic. It's just a structure layout change...
quoted hunk ↗ jump to hunk
@@ -415,9 +415,9 @@ xfs_btree_dup_cursor( * For each level current, re-get the buffer and copy the ptr value. */ for (i = 0; i < new->bc_nlevels; i++) { - new->bc_ptrs[i] = cur->bc_ptrs[i]; - new->bc_ra[i] = cur->bc_ra[i]; - bp = cur->bc_bufs[i]; + new->bc_levels[i].ptr = cur->bc_levels[i].ptr; + new->bc_levels[i].ra = cur->bc_levels[i].ra; + bp = cur->bc_levels[i].bp; if (bp) { error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, xfs_buf_daddr(bp), mp->m_bsize,@@ -429,7 +429,7 @@ xfs_btree_dup_cursor( return error; } } - new->bc_bufs[i] = bp; + new->bc_levels[i].bp = bp; } *ncur = new; return 0;
ObHuh: that dup_cursor code seems like a really obtuse way of doing: bip = cur->bc_levels[i].bp->b_log_item; bip->bli_recur++; new->bc_levels[i] = cur->bc_levels[i]; But that's not a problem this patch needs to solve. Just something that made me go hmmmm...
quoted hunk ↗ jump to hunk
@@ -922,11 +922,11 @@ xfs_btree_readahead( (lev == cur->bc_nlevels - 1)) return 0; - if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev]) + if ((cur->bc_levels[lev].ra | lr) == cur->bc_levels[lev].ra) return 0;
That's whacky logic. Surely that's just: if (cur->bc_levels[lev].ra & lr) return 0;
quoted hunk ↗ jump to hunk
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 1018bcc43d66..f31f057bec9d 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h@@ -212,6 +212,19 @@ struct xfs_btree_cur_ino { #define XFS_BTCUR_BMBT_INVALID_OWNER (1 << 1) }; +struct xfs_btree_level { + /* buffer pointer */ + struct xfs_buf *bp; + + /* key/record number */ + uint16_t ptr; + + /* readahead info */ +#define XFS_BTCUR_LEFTRA 1 /* left sibling has been read-ahead */ +#define XFS_BTCUR_RIGHTRA 2 /* right sibling has been read-ahead */ + uint16_t ra; +};
The ra variable is a bit field. Can we define the values obviously as bit fields with (1 << 0) and (1 << 1) instead of 1 and 2?
quoted hunk ↗ jump to hunk
@@ -242,8 +250,17 @@ struct xfs_btree_cur struct xfs_btree_cur_ag bc_ag; struct xfs_btree_cur_ino bc_ino; }; + + /* Must be at the end of the struct! */ + struct xfs_btree_level bc_levels[]; }; +static inline size_t +xfs_btree_cur_sizeof(unsigned int nlevels) +{ + return struct_size((struct xfs_btree_cur *)NULL, bc_levels, nlevels); +}
Ooooh, yeah, we really need comments explaining how many btree levels these VLAs are tracking, because this one doesn't have a "- 1" in it like the previous one I commented on....
quoted hunk ↗ jump to hunk
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c index c0ef53fe6611..816dfc8e5a80 100644 --- a/fs/xfs/scrub/trace.c +++ b/fs/xfs/scrub/trace.c@@ -21,10 +21,11 @@ xchk_btree_cur_fsbno( struct xfs_btree_cur *cur, int level) { - if (level < cur->bc_nlevels && cur->bc_bufs[level]) + if (level < cur->bc_nlevels && cur->bc_levels[level].bp) return XFS_DADDR_TO_FSB(cur->bc_mp, - xfs_buf_daddr(cur->bc_bufs[level])); - if (level == cur->bc_nlevels - 1 && cur->bc_flags & XFS_BTREE_LONG_PTRS) + xfs_buf_daddr(cur->bc_levels[level].bp)); + else if (level == cur->bc_nlevels - 1 && + cur->bc_flags & XFS_BTREE_LONG_PTRS)
No need for an else there as the first if () clause returns. Also, needs more () around that "a & b" second line. Cheers, Dave. -- Dave Chinner david@fromorbit.com