Re: [PATCH] ext4: remove unnecessary ext4_inode_datasync_dirty in read path
From: Zhongwei Cai <hidden>
Date: 2021-11-10 08:08:21
On 11/9/21 12:50 PM, Dave Chinner wrote:
quoted
Could we add IOMAP_REPORT_DIRTY flag in the flags field of struct iomap_iter to indicate whether the IOMAP_F_DIRTY flag needs to be set or not?You can try. It might turn out OK, but you're also going to have to modify all the iomap code that current needs IOMAP_F_DIRTY to first set that flag, then change all the code that currently sets IOMAP_F_DIRTY to look at IOMAP_REPORT_DIRTY. i.e you now have to change iomap, ext4 and XFS to do this.
I will make a v2 patch with this implementation.
quoted
Currently the IOMAP_F_DIRTY flag is only checked in iomap_swapfile_activate(), dax_iomap_fault() and iomap_dio_rw() (To be more specific, only the write path in dax_iomap_fault() and iomap_dio_rw()). So it would be unnecessary to set the IOMAP_F_DIRTY flag in dax_iomap_rw() called in the previous tests.I think you're trying to optimise the wrong thing - the API is not the problem, the problem is that the journal->j_state_lock is contended and the ext4 dirty inode check needs to take it. Fix the dirty check not to need the journal state lock and the ext4 problem goes away and there is no need to change the iomap infrastructure.
I'll try to fix it inside ext4, although it seems difficult to do dirty check without journal->j_state_lock.
quoted
Other file systems that set the IOMAP_F_DIRTY flag efficiently could ignore the IOMAP_REPORT_DIRTY flag.No, that's just bad API design. If we are adding IOMAP_REPORT_DIRTY then the iomap infrastructure should only use that control flag when it needs to know if the inode is dirty. At this point, it really becomes mandatory for all filesystems using iomap to support it because the absence of IOMAP_F_DIRTY because a filesystem doesn't support it is not the same as "filesystem didn't set it because the inode is clean".
Perhaps I have not made it clear that by "ignore" I mean other file systems can set IOMAP_F_DIRTY regardless of whether the IOMAP_REPORT_DIRTY flag is set or not, just like what they are doing now. So we might not need to modify XFS. I think even without the modification I made, the ambiguity that the absence of IOMAP_F_DIRTY can either be file systems not supporting it or be actually "clean inode" may exist since we do not have a flag to indicate whether the file system supports setting IOMAP_F_DIRTY. Best, Zhongwei