Re: [PATCH v3 2/5] btrfs: change error handling for btrfs_delete_*_in_log
From: Filipe Manana <hidden>
Date: 2021-09-28 09:54:28
On Mon, Sep 27, 2021 at 7:06 PM Josef Bacik [off-list ref] wrote:
Currently we will abort the transaction if we get a random error (like -EIO) while trying to remove the directory entries from the root log during rename. However the log sync stuff doesn't actually honor the transaction abort flag, it'll happily commit the log even if we've aborted the transaction for reasons related to the log, which is a pretty bad problem.
I'm not seeing how that would happen.
Since your commit 165ea85f14831f ("btrfs: do not write supers if we
have an fs error"), log syncing actually checks if a transaction was
aborted, and skips the log commit.
We're also aborting the transaction while holding the log pinned.
So I don't see how we can end up committing an inconsistent log if any
of the calls to remove dirents from the log fails.
Maybe this patch was prepared before that other patch?
Either way it seems the change log needs to be updated, because log
syncing checks for transaction aborts / fs error state.
Fix this by making these functions void, as we don't actually care if this fails. Instead mark the log as requiring a full commit on error.
This makes sense, there's really no need to abort a transaction, forcing the next log commit attempt to fallback to a transaction commit is more than enough.
This will keep us from committing this bad log, and if we fsync we'll force a full transaction commit and have a consistent file system.
Codewise it looks good, I just don't think the part of committing a bad log can happen after that other commit. Thanks.
quoted hunk ↗ jump to hunk
Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/inode.c | 16 +++------------- fs/btrfs/tree-log.c | 41 ++++++++++++++--------------------------- fs/btrfs/tree-log.h | 16 ++++++++-------- 3 files changed, 25 insertions(+), 48 deletions(-)diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a643be30c18a..03a843b9659b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c@@ -4108,19 +4108,9 @@ static int __btrfs_unlink_inode(struct btrfs_trans_handle *trans, goto err; } - ret = btrfs_del_inode_ref_in_log(trans, root, name, name_len, inode, - dir_ino); - if (ret != 0 && ret != -ENOENT) { - btrfs_abort_transaction(trans, ret); - goto err; - } - - ret = btrfs_del_dir_entries_in_log(trans, root, name, name_len, dir, - index); - if (ret == -ENOENT) - ret = 0; - else if (ret) - btrfs_abort_transaction(trans, ret); + btrfs_del_inode_ref_in_log(trans, root, name, name_len, inode, + dir_ino); + btrfs_del_dir_entries_in_log(trans, root, name, name_len, dir, index); /* * If we have a pending delayed iput we could end up with the final iputdiff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index e0c2d4c7f939..720723611875 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c@@ -3500,10 +3500,10 @@ static bool inode_logged(struct btrfs_trans_handle *trans, * This optimizations allows us to avoid relogging the entire inode * or the entire directory. */ -int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - const char *name, int name_len, - struct btrfs_inode *dir, u64 index) +void btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + const char *name, int name_len, + struct btrfs_inode *dir, u64 index) { struct btrfs_root *log; struct btrfs_dir_item *di;@@ -3513,11 +3513,11 @@ int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, u64 dir_ino = btrfs_ino(dir); if (!inode_logged(trans, dir)) - return 0; + return; ret = join_running_log_trans(root); if (ret) - return 0; + return; mutex_lock(&dir->log_mutex);@@ -3565,49 +3565,36 @@ int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, btrfs_free_path(path); out_unlock: mutex_unlock(&dir->log_mutex); - if (err == -ENOSPC) { + if (err < 0 && err != -ENOENT) btrfs_set_log_full_commit(trans); - err = 0; - } else if (err < 0 && err != -ENOENT) { - /* ENOENT can be returned if the entry hasn't been fsynced yet */ - btrfs_abort_transaction(trans, err); - } - btrfs_end_log_trans(root); - - return err; } /* see comments for btrfs_del_dir_entries_in_log */ -int btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - const char *name, int name_len, - struct btrfs_inode *inode, u64 dirid) +void btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + const char *name, int name_len, + struct btrfs_inode *inode, u64 dirid) { struct btrfs_root *log; u64 index; int ret; if (!inode_logged(trans, inode)) - return 0; + return; ret = join_running_log_trans(root); if (ret) - return 0; + return; log = root->log_root; mutex_lock(&inode->log_mutex); ret = btrfs_del_inode_ref(trans, log, name, name_len, btrfs_ino(inode), dirid, &index); mutex_unlock(&inode->log_mutex); - if (ret == -ENOSPC) { + if (ret < 0 && ret != -ENOENT) btrfs_set_log_full_commit(trans); - ret = 0; - } else if (ret < 0 && ret != -ENOENT) - btrfs_abort_transaction(trans, ret); btrfs_end_log_trans(root); - - return ret; } /*diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h index 3ce6bdb76009..f6811c3df38a 100644 --- a/fs/btrfs/tree-log.h +++ b/fs/btrfs/tree-log.h@@ -70,14 +70,14 @@ int btrfs_recover_log_trees(struct btrfs_root *tree_root); int btrfs_log_dentry_safe(struct btrfs_trans_handle *trans, struct dentry *dentry, struct btrfs_log_ctx *ctx); -int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - const char *name, int name_len, - struct btrfs_inode *dir, u64 index); -int btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - const char *name, int name_len, - struct btrfs_inode *inode, u64 dirid); +void btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + const char *name, int name_len, + struct btrfs_inode *dir, u64 index); +void btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + const char *name, int name_len, + struct btrfs_inode *inode, u64 dirid); void btrfs_end_log_trans(struct btrfs_root *root); void btrfs_pin_log_trans(struct btrfs_root *root); void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans, --2.26.3
-- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”