Thread (15 messages) 15 messages, 3 authors, 2012-03-19

Re: [PATCH 4/4] writeback: Avoid iput() from flusher thread

From: Christoph Hellwig <hch@infradead.org>
Date: 2012-03-19 08:55:15
Also in: linux-fsdevel, lkml

On Fri, Mar 09, 2012 at 10:02:28AM +0100, Jan Kara wrote:
quoted hunk ↗ jump to hunk
Doing iput() from flusher thread (writeback_sb_inodes()) can create problems
because iput() can do a lot of work - for example truncate the inode if it's
the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to
allocate blocks during truncate (due to their COW nature) and in some cases
they thus need to flush dirty data from truncate to reduce uncertainty in the
amount of free space. This effectively creates a deadlock.

We get rid of iput() in flusher thread by using the fact that I_SYNC inode
flag effectively pins the inode in memory. So if we take care to either hold
i_lock or have I_SYNC set, we can get away without taking inode reference
in writeback_sb_inodes().

As a side effect, we also fix possible use-after-free in wb_writeback() because
inode_wait_for_writeback() call could try to reacquire i_lock on the inode that
was already free.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c         |   38 ++++++++++++++++++++++++--------------
 fs/inode.c                |   11 ++++++++++-
 include/linux/fs.h        |    7 ++++---
 include/linux/writeback.h |    7 +------
 4 files changed, 39 insertions(+), 24 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 1e8bf44..f9f9b61 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc)
 }
 
 /*
- * Wait for writeback on an inode to complete.
+ * Wait for writeback on an inode to complete. Called with i_lock held.
+ * Return 1 if we dropped i_lock and waited, 0 is returned otherwise.
  */
-static void inode_wait_for_writeback(struct inode *inode)
+int __must_check inode_wait_for_writeback(struct inode *inode)
 {
 	DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC);
 	wait_queue_head_t *wqh;
 
 	wqh = bit_waitqueue(&inode->i_state, __I_SYNC);
+	if (inode->i_state & I_SYNC) {
 		spin_unlock(&inode->i_lock);
 		__wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE);
+		return 1;
 	}
+	return 0;
This is a horribly ugl primitive.

I'd rather add a

void inode_wait_for_writeback(struct inode *inode)
{
 	DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC);
 	wait_queue_head_t *wqh = bit_waitqueue(&inode->i_state, __I_SYNC);

	__wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE);
}

and opencode all the locking ad I_SYNC checking logic in the callers.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help