Thread (25 messages) 25 messages, 5 authors, 2016-05-08

Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io

From: Dan Williams <hidden>
Date: 2016-05-05 15:15:32
Also in: linux-ext4, linux-fsdevel, linux-mm, linux-xfs, lkml

On Thu, May 5, 2016 at 7:24 AM, Christoph Hellwig [off-list ref] wrote:
On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
quoted
quoted
All IO in a dax filesystem used to go through dax_do_io, which cannot
handle media errors, and thus cannot provide a recovery path that can
send a write through the driver to clear errors.

Add a new iocb flag for DAX, and set it only for DAX mounts. In the IO
path for DAX filesystems, use the same direct_IO path for both DAX and
direct_io iocbs, but use the flags to identify when we are in O_DIRECT
mode vs non O_DIRECT with DAX, and for O_DIRECT, use the conventional
direct_IO path instead of DAX.
Really? What are your thinking here?

What about all the current users of O_DIRECT, you have just made them
4 times slower and "less concurrent*" then "buffred io" users. Since
direct_IO path will queue an IO request and all.
(And if it is not so slow then why do we need dax_do_io at all? [Rhetorical])

I hate it that you overload the semantics of a known and expected
O_DIRECT flag, for special pmem quirks. This is an incompatible
and unrelated overload of the semantics of O_DIRECT.
Agreed - makig O_DIRECT less direct than not having it is plain stupid,
and I somehow missed this initially.
Of course I disagree because like Dave argues in the msync case we
should do the correct thing first and make it fast later, but also
like Dave this arguing in circles is getting tiresome.
This whole DAX story turns into a major nightmare, and I fear all our
hodge podge tweaks to the semantics aren't helping it.

It seems like we simply need an explicit O_DAX for the read/write
bypass if can't sort out the semantics (error, writer synchronization)
just as we need a special flag for MMAP.
I don't see how O_DAX makes this situation better if the goal is to
accelerate unmodified applications...

Vishal, at least the "delete a file with a badblock" model will still
work for implicitly clearing errors with your changes to stop doing
block clearing in fs/dax.c.  This combined with a new -EBADBLOCK (as
Dave suggests) and explicit logging of I/Os that fail for this reason
at least gives a chance to communicate errors in files to suitably
aware applications / environments.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help