Re: [PATCH -v2] ext4: optimize ext4_should_retry_alloc() to improve ENOSPC performance
From: Jan Kara <jack@suse.cz>
Date: 2016-06-16 19:44:46
On Tue 07-06-16 22:46:46, Ted Tso wrote:
If there are no pending blocks to be released after a commit, retrying the allocation after a journal commit has no hope of helping. So track how many pending deleted blocks there might be, and don't retry if there are no pending blocks.

Reported-by: Chao Yu <redacted>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---

Oops, ignore the earlier version of this patch. I bobbled the commit and merged in part of another change.
Couple of notes below:
 fs/ext4/balloc.c    |  9 ++++++++-
 fs/ext4/ext4.h      |  1 +
 fs/ext4/ext4_jbd2.h | 10 +++++++++-
 fs/ext4/mballoc.c   | 12 ++++++++++--
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..371ac63 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -603,7 +603,14 @@ int ext4_claim_free_clusters(struct ext4_sb_info *sbi,
  */
 int ext4_should_retry_alloc(struct super_block *sb, int *retries)
 {
-	if (!ext4_has_free_clusters(EXT4_SB(sb), 1, 0) ||
+	unsigned int pending_blocks;
+
+	spin_lock(&EXT4_SB(sb)->s_md_lock);
+	pending_blocks = EXT4_SB(sb)->s_mb_free_pending;
+	spin_unlock(&EXT4_SB(sb)->s_md_lock);
+
+	if (pending_blocks == 0 ||
+	    !ext4_has_free_clusters(EXT4_SB(sb), 1, 0) ||
 	    (*retries)++ > 3 ||
 	    !EXT4_SB(sb)->s_journal)
 		return 0;
But this is racy. Transaction commit could have finished before we called ext4_should_retry_alloc() and so we will mistakenly think there's no hope although there are blocks free now. But what you could probably do is just return 1 without forcing a transaction commit when pending_blocks == 0.
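To make the suggestion concrete, here is a rough userspace sketch of the control flow I have in mind. It is not kernel code: struct sb_model, model_force_commit() and should_retry_alloc_model() are made-up stand-ins for ext4_sb_info, the forced journal commit and ext4_should_retry_alloc(), and the locking around the counter is left out entirely. The only point is that pending == 0 skips the forced commit but still tells the caller to retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few ext4_sb_info fields that matter here. */
struct sb_model {
	unsigned int free_pending;   /* models s_mb_free_pending */
	bool has_free_clusters;      /* models ext4_has_free_clusters(sbi, 1, 0) */
	bool has_journal;            /* models sbi->s_journal != NULL */
	unsigned int forced_commits; /* how many commits this model forced */
};

/* Models a forced journal commit: pending frees become usable afterwards. */
void model_force_commit(struct sb_model *sb)
{
	sb->forced_commits++;
	sb->free_pending = 0;
}

/* Sketch of the suggested flow; returns 1 if the caller should retry. */
int should_retry_alloc_model(struct sb_model *sb, int *retries)
{
	if (!sb->has_free_clusters ||
	    (*retries)++ > 3 ||
	    !sb->has_journal)
		return 0;

	/*
	 * Nothing is waiting on a commit. A commit may have finished just
	 * before we got here, so retry anyway, but do not force a commit.
	 */
	if (sb->free_pending == 0)
		return 1;

	model_force_commit(sb);
	return 1;
}

/* Usage example: a retry with and without pending blocks. */
void example_usage(void)
{
	struct sb_model sb = { 0, true, true, 0 };
	int retries = 0;

	/* No pending blocks: retry, but no commit is forced. */
	assert(should_retry_alloc_model(&sb, &retries) == 1);
	assert(sb.forced_commits == 0);

	/* Pending blocks: retry after forcing a commit. */
	sb.free_pending = 8;
	assert(should_retry_alloc_model(&sb, &retries) == 1);
	assert(sb.forced_commits == 1);
}
```

The retry limit still bounds the loop, so the worst case with no pending blocks is a few extra allocation attempts without any commit being forced.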
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b84aa1c..96c73e6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1430,6 +1430,7 @@ struct ext4_sb_info {
 	unsigned short *s_mb_offsets;
 	unsigned int *s_mb_maxs;
 	unsigned int s_group_info_size;
+	unsigned int s_mb_free_pending;
 
 	/* tunables */
 	unsigned long s_stripe;
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 09c1ef3..b1d52c1 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -175,6 +175,13 @@ struct ext4_journal_cb_entry {
  * There is no guaranteed calling order of multiple registered callbacks on
  * the same transaction.
  */
+static inline void _ext4_journal_callback_add(handle_t *handle,
+			struct ext4_journal_cb_entry *jce)
+{
+	/* Add the jce to transaction's private list */
+	list_add_tail(&jce->jce_list, &handle->h_transaction->t_private_list);
+}
+
 static inline void ext4_journal_callback_add(handle_t *handle,
 			void (*func)(struct super_block *sb,
 				     struct ext4_journal_cb_entry *jce,
Well, since ext4_mb_free_metadata() is the only user of ext4_journal_callback_add(), ext4_journal_callback_add() won't have any user after your patch. Maybe we could just stop playing these abstraction games nobody currently uses and just implement a helper function to add freeing callback to the transaction list including increment of the pending counter.

								Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR