Thread (22 messages), 5 authors, 2016-02-09

Re: [PATCH 2/2] dax: move writeback calls into the filesystems

From: Ross Zwisler <hidden>
Date: 2016-02-09 18:06:47
Also in: linux-fsdevel, linux-mm, linux-xfs, lkml, nvdimm

On Tue, Feb 09, 2016 at 05:01:34PM +0100, Jan Kara wrote:
On Tue 09-02-16 10:43:53, Jan Kara wrote:
On Mon 08-02-16 12:55:24, Dan Williams wrote:
On Mon, Feb 8, 2016 at 12:18 PM, Dave Chinner [off-list ref] wrote:
[..]
Setting aside the current block zeroing problem you seem to be assuming
that DAX will always be faster and that may not be true at a media
level.  Waiting years for some applications to determine if DAX makes
sense for their use case seems completely reasonable.  In the meantime
the apps that are already making these changes want to know that a DAX
mapping request has not silently dropped back to page cache.  They
also want to know if they successfully jumped through all the hoops to
get a larger than pte mapping.

I agree it is useful to be able to force DAX on an unmodified
application to see what happens, and it follows that if those
applications want to run in that mode they will need functional
fsync()...

I would feel better if we were talking about specific applications and
performance numbers to know if forcing DAX on an application is a debug
facility or a production level capability.  You seem to have already
made that determination and I'm curious what I'm missing.

I'm not setting any policy here at all.  This whole argument is
based around the DAX mount option doing "global fs enable or
silently turning it off" and the application not knowing about that.

The whole point of having a persistent per-inode DAX flag is that
it is a policy mechanism, not a policy.  The application can, if it
is DAX aware, directly control whether DAX is used on a file or not.
The application can even query and clear that persistent inode flag
if it is configured not to (or cannot) use DAX.

If the filesystem cannot support DAX, then we can error out attempts
to set the DAX flag and then the app knows DAX is not available.
i.e. the attempt to set policy failed. If the flag is set, then the
inode will *always* use DAX - there is no "fall back to page cache"
when DAX is enabled.

If the application is not DAX aware, then the admin can control the
DAX policy by manipulating these flags themselves, and hence control
whether DAX is used by the application or not.
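As an illustration of the flag manipulation described above (not part of the original message), here is a minimal userspace sketch. It assumes the per-inode flag is exposed as FS_XFLAG_DAX through the FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR ioctls from <linux/fs.h>. The helper names dax_xflags_apply() and set_inode_dax() are hypothetical, and whether a given filesystem accepts or rejects the flag depends on the kernel and filesystem in use.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR, FS_IOC_FSSETXATTR, FS_XFLAG_DAX */

/* Hypothetical helper: return xflags with the DAX bit set or cleared. */
static uint32_t dax_xflags_apply(uint32_t xflags, bool enable)
{
    return enable ? (xflags | FS_XFLAG_DAX)
                  : (xflags & ~(uint32_t)FS_XFLAG_DAX);
}

/*
 * Hypothetical helper: read the inode's extended flags, flip the DAX bit,
 * and write them back. Returns 0 on success, -1 with errno set on failure,
 * so a caller can tell that the attempt to set policy was refused.
 */
static int set_inode_dax(const char *path, bool enable)
{
    struct fsxattr fsx;
    int ret = -1;
    int saved_errno;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) == 0) {
        fsx.fsx_xflags = dax_xflags_apply(fsx.fsx_xflags, enable);
        ret = ioctl(fd, FS_IOC_FSSETXATTR, &fsx);
    }

    /* Preserve the ioctl's errno across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

An application or admin tool would call set_inode_dax(path, true) and treat a -1 return as "DAX is not available for this file", which matches the "attempt to set policy failed" case above.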

If you think I'm dictating policy for DAX users and applications,
then you haven't understood anything I've previously said about why
the DAX mount option needs to die before any of this is considered
production ready. DAX is not an opaque "all or nothing" option. XFS
will provide apps and admins with fine-grained, persistent,
discoverable policy flags to allow admins and applications to set
DAX policies however they see fit. This simply cannot be done if the
only knob you have is a mount option that may or may not stick.

I agree the mount option needs to die, and I fully grok the reasoning.
What I'm concerned with is that a system using fully-DAX-aware
applications is forced to incur the overhead of maintaining *sync
semantics, periodic sync(2) in particular, even if it is not relying
on those semantics.

Let me somewhat correct this: IMO the hard requirement is maintaining sync(2)
semantics. Periodic writeback does not have any hard durability guarantees
and we are free to ignore such requests in ->writepages() (that function
has enough information in the writeback_control structure to differentiate
between periodic writeback and data integrity sync) if we decide it is
useful. Actually, we could do that even for 4.5.

Attached is a version of Ross' patch that will work for sync(2) and
fsync(2) and we won't flush caches during periodic writeback. The patch is
only compile-tested. Ross?

This looks great.  I'll send out a v2 with this and with the
dax_clear_sectors() changes after I'm done testing.
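
For readers following along, the distinction Jan describes can be sketched in a small userspace model. This is not the attached patch: the model_* names are hypothetical stand-ins, and the only assumption carried over from the kernel is that struct writeback_control has a sync_mode field that is WB_SYNC_ALL for data-integrity syncs (sync(2), fsync(2)) and WB_SYNC_NONE for periodic or background writeback.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel's enum writeback_sync_modes and
 * struct writeback_control; only the field used here is modeled. */
enum model_sync_mode {
    MODEL_WB_SYNC_NONE,  /* periodic/background writeback, no durability promise */
    MODEL_WB_SYNC_ALL,   /* data-integrity sync: sync(2), fsync(2) */
};

struct model_wbc {
    enum model_sync_mode sync_mode;
};

/* Counts simulated cache flushes so the behaviour can be observed. */
static int flush_calls;

static void model_flush_dirty_ranges(void)
{
    flush_calls++;
}

/*
 * Hypothetical ->writepages()-style hook: skip the expensive cache flush
 * unless the caller asked for a data-integrity sync and there is
 * something dirty to flush.
 */
static int model_dax_writepages(const struct model_wbc *wbc, bool has_dirty)
{
    if (!has_dirty || wbc->sync_mode != MODEL_WB_SYNC_ALL)
        return 0;

    model_flush_dirty_ranges();
    return 0;
}
```

In this model a periodic writeback request returns immediately without flushing, while a sync(2)- or fsync(2)-style request flushes the dirty ranges, which is the behaviour the message above asks for.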