Re: [PATCH, RFC] writeback: avoid redirtying when ->write_inode failed to clear I_DIRTY
From: Wu Fengguang <hidden>
Date: 2011-08-27 13:58:29
Also in:
linux-fsdevel
Christoph, On Sat, Aug 27, 2011 at 02:14:09PM +0800, Christoph Hellwig wrote:
Right now ->write_inode has no way to safely return a EAGAIN without explicitly redirtying the inode, as we would lose the dirty state otherwise. Most filesystems get this wrong, but XFS makes heavy use of it to avoid blocking the flusher thread when ->write_inode hits contentended inode locks. A contended ilock is something XFS can hit very easibly when extending files, as the data I/O completion handler takes the lock to update the size, and the ->write_inode call can race with it fairly easily if writing enough data in one go so that the completion for the first write come in just before we call ->write_inode. Change the handling of this case to use requeue_io for a quick retry instead of redirty_tail, which keeps moving out the dirtied_when data and thus keeps delaying the writeout more and more with every failed attempt to get the lock.
Yeah redirty_tail() does have the problem of possibly delay inodes for too long time. However, you know requeue_io() always has the danger of triggering busy wb_writeback() loops in corner cases. For example, nfs_write_inode()/nfs_commit_unstable_pages() often redirty the inode without doing anything (hence no any progress, a prerequisite for busy loops) depending on the in flight writes, which unfortunately further depends on _external_ network/server states.. That means some stalled network/sever state could lead to busy loops in NFS clients. The alternative solution may be to firstly apply the attached patch, and change this one to: - redirty_tail(inode, wb); + requeue_io_wait(inode, wb); Thanks, Fengguang
quoted hunk ↗ jump to hunk
Signed-off-by: Christoph Hellwig <hch@lst.de> Index: linux-2.6/fs/fs-writeback.c ===================================================================--- linux-2.6.orig/fs/fs-writeback.c 2011-08-26 14:47:42.137050059 +0200 +++ linux-2.6/fs/fs-writeback.c 2011-08-26 15:06:47.003493601 +0200@@ -464,8 +464,18 @@ writeback_single_inode(struct inode *ino * operations, such as delayed allocation during * submission or metadata updates after data IO * completion. + * + * For the latter case it is very important to give + * the inode another turn on b_more_io instead of + * redirtying it. Constantly moving dirtied_when + * forward will prevent us from ever writing out + * the metadata dirtied in the I/O completion handler. + * + * For files on XFS that constantly get appended to + * calling redirty_tail means they will never get + * their updated i_size written out. */ - redirty_tail(inode, wb); + requeue_io(inode, wb); } else { /* * The inode is clean. At this point we either have
Attachments
- writeback-more_io_wait.patch [text/x-diff] 5015 bytes · preview
- (unnamed) [text/plain] 121 bytes · preview