Thread: 33 messages, 6 authors, 2008-08-21

Re: [PATCH] jbd jbd2: fix dio write returning EIO when try_to_release_page fails

From: Chris Mason <hidden>
Date: 2008-08-06 13:25:13
Also in: linux-fsdevel

On Tue, 2008-08-05 at 14:17 -0700, Mingming Cao wrote:
On Tue, 2008-08-05 at 12:17 -0400, Chris Mason wrote:
On Tue, 2008-08-05 at 13:51 +0900, Hisashi Hifumi wrote:
diff -Nrup linux-2.6.27-rc1.org/fs/jbd/transaction.c linux-2.6.27-rc1/fs/jbd/transaction.c
--- linux-2.6.27-rc1.org/fs/jbd/transaction.c	2008-07-29 19:28:47.000000000 +0900
+++ linux-2.6.27-rc1/fs/jbd/transaction.c	2008-07-29 20:40:12.000000000 +0900
@@ -1764,6 +1764,12 @@ int journal_try_to_free_buffers(journal_
 	*/
 	if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
 		journal_wait_for_transaction_sync_data(journal);
+
+		bh = head;
+		do {
+			while (atomic_read(&bh->b_count))
+				schedule();
+		} while ((bh = bh->b_this_page) != head);
 		ret = try_to_free_buffers(page);
 	}
The loop is problematic.  If the scheduler decides to keep running this
task then we have a busy loop.  If this task has a realtime policy then
it might even lock up the kernel.
ocfs2 calls journal_try_to_free_buffers too, so looping on b_count might
not be the best idea there either.
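
To make the busy-loop concern concrete, here is a minimal user-space sketch
of the same walk over a page's buffers, but with a bounded number of yields
per buffer instead of an unbounded spin.  It is an illustration only, not a
fix proposed in this thread: struct mock_bh, mock_wait_for_buffers() and the
max_yields budget are hypothetical names, and sched_yield() stands in for the
kernel's schedule().

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct buffer_head. */
struct mock_bh {
	atomic_int b_count;          /* reference count, like bh->b_count */
	struct mock_bh *b_this_page; /* circular list of buffers on one page */
};

/*
 * Walk the circular list starting at head.  For each buffer, yield the CPU
 * at most max_yields times while it is still referenced.  Returns true if
 * every buffer was seen with a zero count, false if a budget ran out.
 */
static bool mock_wait_for_buffers(struct mock_bh *head, int max_yields)
{
	struct mock_bh *bh = head;

	assert(head != NULL && max_yields >= 0);
	do {
		int tries = 0;

		while (atomic_load(&bh->b_count) != 0) {
			if (tries++ >= max_yields)
				return false; /* give up instead of spinning forever */
			sched_yield();
		}
		bh = bh->b_this_page;
	} while (bh != head);
	return true;
}
```

With a bound, a buffer that stays referenced makes the call fail rather than
spin; the caller then has to handle that failure, which is the original EIO
problem again, so this only shows why the unbounded form is risky.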

This code gets called from releasepage, which is used in places other than
the O_DIRECT invalidation paths, so I'd be worried about performance
problems here.
try_to_release_page has a gfp_mask parameter, so when it is called from a
performance-sensitive path, __GFP_WAIT and __GFP_FS should not be set in gfp_mask.
The b_count check loop is inside the (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS) check.
Looks like try_to_free_pages will go into releasepage with wait & fs
both set.  This kind of change would make me very nervous.
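
The flag test under discussion can be sketched in isolation.  This is a
user-space illustration with hypothetical MOCK_* names and made-up bit
values, not the kernel's definitions; it only assumes that a GFP_KERNEL-style
mask carries the wait, I/O and fs bits while a GFP_NOFS-style mask drops the
fs bit.

```c
#include <assert.h> /* for the usage checks mentioned below */
#include <stdbool.h>

/* Illustrative flag bits: hypothetical stand-ins, not the kernel's values. */
#define MOCK_GFP_WAIT 0x1u /* caller may sleep */
#define MOCK_GFP_IO   0x2u /* caller may start disk I/O */
#define MOCK_GFP_FS   0x4u /* caller may re-enter the filesystem */

/* Composite masks modelled on GFP_KERNEL and GFP_NOFS. */
#define MOCK_GFP_KERNEL (MOCK_GFP_WAIT | MOCK_GFP_IO | MOCK_GFP_FS)
#define MOCK_GFP_NOFS   (MOCK_GFP_WAIT | MOCK_GFP_IO)

/*
 * Mirrors the mask part of the condition in the quoted hunk: the waiting
 * path is taken only when both the wait bit and the fs bit are set.
 * (The hunk additionally requires ret == 0.)
 */
static bool mock_may_wait_for_buffers(unsigned int gfp_mask)
{
	return (gfp_mask & MOCK_GFP_WAIT) && (gfp_mask & MOCK_GFP_FS);
}
```

Under these assumptions, assert(mock_may_wait_for_buffers(MOCK_GFP_KERNEL))
holds while the MOCK_GFP_NOFS case does not, which is why a reclaim caller
passing a GFP_KERNEL-style mask would reach the new loop.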
Hi Chris,

The gfp_mask that try_to_free_pages() takes from its caller is passed
down to try_to_release_page().  Based on the meaning of __GFP_WAIT and
__GFP_FS, if the upper-level caller sets these two flags, I assume the
upper-level caller expects a delay and will wait for the fs to finish?


But I agree that using a loop in journal_try_to_free_buffers() to wait
for the busy bh to drop its reference count is expensive...
I rediscovered your old thread about trying to do this in a launder_page
call ;)

Does it make more sense to fix do_launder_page to call into the FS on
every page, and let the FS check for PageDirty on its own?  That way
invalidate_inode_pages2_range basically gets its own private call into
the FS that says wait around until this page is really free.

-chris
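
The launder_page suggestion in the message above can be modelled in user
space as follows.  This is a rough sketch with hypothetical mock_* types and
functions, not kernel code: it assumes, as the message implies, that the
current helper skips clean pages, and it contrasts that with a variant that
calls the filesystem hook for every page and leaves the dirty check to the
filesystem.

```c
#include <assert.h> /* for checks a caller might write */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for struct page and the aops table. */
struct mock_page {
	bool dirty;   /* models PageDirty() */
	int fs_calls; /* counts how often the filesystem hook ran */
};

struct mock_aops {
	int (*launder_page)(struct mock_page *page);
};

/* Models the behaviour being questioned: clean pages never reach the FS. */
static int mock_launder_dirty_only(const struct mock_aops *ops,
				   struct mock_page *page)
{
	if (!page->dirty || ops->launder_page == NULL)
		return 0;
	return ops->launder_page(page);
}

/* Models the suggestion: call into the FS for every page. */
static int mock_launder_always(const struct mock_aops *ops,
			       struct mock_page *page)
{
	if (ops->launder_page == NULL)
		return 0;
	return ops->launder_page(page);
}

/* Example FS hook that checks the dirty state on its own. */
static int mock_fs_launder(struct mock_page *page)
{
	page->fs_calls++;
	if (page->dirty)
		page->dirty = false; /* stand-in for writing the page back */
	/* A real filesystem could wait here until the page is really free. */
	return 0;
}
```

In this model a clean page with busy buffers is invisible to the filesystem
in the first variant and visible in the second, which is the private "wait
until this page is really free" call described above.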

