Thread (14 messages), 4 authors, 2009-01-22

Re: [RFC] [PATCH] vfs: Call filesystem callback when backing device caches should be flushed

From: Jamie Lokier <hidden>
Date: 2009-01-21 21:47:50
Also in: linux-fsdevel

Jan Kara wrote:
> On Tue 20-01-09 15:16:48, Joel Becker wrote:
> > On Tue, Jan 20, 2009 at 05:05:27PM +0100, Jan Kara wrote:
> > >   we noted in our testing that ext2 (and it seems some other filesystems as
> > > well) don't flush disk's write caches on cases like fsync() or changing
> > > DIRSYNC directory. This is my attempt to solve the problem in a generic way
> > > by calling a filesystem callback from VFS at appropriate place as Andrew
> > > suggested. For ext2 what I did is enough (it just then fills in
> > > block_flush_device() as .flush_device callback) and I think it could be
> > > fine for other filesystems as well.
> > 	The only question I have is why this would be optional.  It
> > would seem that this would be the preferred default behavior for all
> > block filesystems.  We have the backing_dev_info and a way to override
> > the default if a filesystem needs something special.
>   The reason why I've decided for NOP to be the default is that
> filesystems doing proper journalling with barriers should not need
> this (as the barrier in the transaction commit already does the job
> for them).

No, that doesn't work.

fsync() doesn't always cause a transaction.  If there's no inode
change, there may not be a transaction.  Writing does not always dirty
mtime, if it's within mtime granularity.

For efficient fdatasync() you _never_ want a transaction if possible,
because it forces the disk head to seek between alternating regions of
the disk, two seeks per fsync().

So you can't rely on journalling transactions to flush.
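
To make the argument above concrete, here is a minimal userspace sketch, not kernel code. Every name in it (`model_inode`, `model_disk`, `model_fsync_journal_only`, `model_fsync_explicit`) is hypothetical and exists only for illustration. It assumes that a journal commit with a barrier flushes the disk write cache as a side effect, and shows that an fsync() which dirties no metadata never reaches that commit.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_inode {
    bool data_dirty;      /* file data still needs writing */
    bool metadata_dirty;  /* inode fields (e.g. mtime) changed */
};

struct model_disk {
    int cache_flushes;    /* write-cache flushes issued so far */
};

/* Assumption: a commit with a barrier flushes the disk cache as a side effect. */
static void model_commit_transaction(struct model_disk *d)
{
    d->cache_flushes++;
}

/* fsync that relies only on the journal commit barrier. */
static void model_fsync_journal_only(struct model_inode *i, struct model_disk *d)
{
    i->data_dirty = false;            /* data reaches the disk's volatile cache */
    if (i->metadata_dirty) {
        model_commit_transaction(d);  /* barrier flushes the cache */
        i->metadata_dirty = false;
    }
    /* No metadata change: no transaction, no barrier, no flush. */
}

/* fsync that issues an explicit flush when no commit did it. */
static void model_fsync_explicit(struct model_inode *i, struct model_disk *d)
{
    int before = d->cache_flushes;

    model_fsync_journal_only(i, d);
    if (d->cache_flushes == before)
        d->cache_flushes++;           /* explicit cache flush */
}
```

In this model a write within mtime granularity leaves `metadata_dirty` false, so `model_fsync_journal_only()` returns with the data still sitting in the disk's cache.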
>   Finally, I prefer maintainers of the filesystems themselves to decide
> whether their filesystem needs flushing and thus knowingly impose this
> performance penalty on them...

I say it should flush by default unless a filesystem hooks an
alternative strategy.  Certainly, it's silly to have the same code
duplicated in nearly every filesystem.
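
A rough userspace sketch of the "flush by default, override if needed" policy might look like the following. It is not the patch under discussion: `model_sb`, `model_fs_ops`, `model_vfs_flush` and `model_journal_flush` are hypothetical names, and `model_block_flush_device` only stands in for the `block_flush_device()` helper mentioned in the quoted patch description.

```c
#include <stddef.h>

/* Hypothetical userspace model of "flush by default unless overridden". */
struct model_sb;

struct model_fs_ops {
    /* Optional hook; NULL means "use the generic default". */
    int (*flush_device)(struct model_sb *sb);
};

struct model_sb {
    const struct model_fs_ops *ops;
    int generic_flushes;  /* times the generic flush ran */
    int custom_flushes;   /* times a filesystem-specific hook ran */
};

/* Stand-in for a generic block-device cache flush. */
static int model_block_flush_device(struct model_sb *sb)
{
    sb->generic_flushes++;
    return 0;
}

/* Example override, e.g. for a filesystem with its own flushing strategy. */
static int model_journal_flush(struct model_sb *sb)
{
    sb->custom_flushes++;
    return 0;
}

/* Use the filesystem's hook if it set one, otherwise flush generically. */
static int model_vfs_flush(struct model_sb *sb)
{
    if (sb->ops && sb->ops->flush_device)
        return sb->ops->flush_device(sb);
    return model_block_flush_device(sb);
}
```

With this shape, a filesystem that does nothing special still gets a flush, and only the filesystems that need a different strategy carry extra code.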

-- Jamie