Re: [PATCH] btrfs: make btrfs_dirty_inode() to always reserve metadata space
From: Qu Wenruo <hidden>
Date: 2021-01-12 12:31:11
On 2021/1/12 下午8:19, Nikolay Borisov wrote:
On 8.01.21 г. 7:36 ч., Qu Wenruo wrote:quoted
There are several qgroup flush related bugs fixed recently, all of them are caused by the fact that we can trigger qgroup metadata space reservation holding a transaction handle. Thankfully the only situation to trigger above reservation is btrfs_dirty_inode(). Currently btrfs_dirty_inode() will try join transactio first, then update the inode. If btrfs_update_inode() fails with -ENOSPC, then it retry to start transaction to reserve metadata space. This not only forces us to reserve metadata space with a transaction handle hold, but can't handle other errors like -EDQUOT. This patch will make btrfs_dirty_inode() to call btrfs_start_transaction() directly without first try joining then starting, so that in try_flush_qgroup() we won't hold a trans handle. This will slow down btrfs_dirty_inode() but my fstests doesn't show too much different for most test cases, thus it may be worthy to skip such performance "optimization".btrfs_dirty_inode is called everytime file_update_time is called or atime is updated. Given the default mount option is to use lazy a_time I'd say we needn't worry about it. file_update_time, however, is called within the write path so this change potentially makes file write more expensive. So instead of testing with fstests it needs proper benchmarking with fio.
Thanks for the benchmark tool. One of my biggest question is what tool should be used to properly benchmark the patch. Although I don't have extra physical machine, at least I can test with a VM with cache=none mode using a dedicated SSD for this case. Will update the patch to include the new benchmark result. Thanks, Qu
<snip>