Thread: 30 messages, 9 authors, 2016-08-09

Re: Subtle races between DAX mmap fault and write path

From: Dan Williams <hidden>
Date: 2016-07-29 14:44:26
Also in: linux-fsdevel, linux-xfs, nvdimm

On Thu, Jul 28, 2016 at 7:21 PM, Dave Chinner [off-list ref] wrote:
On Thu, Jul 28, 2016 at 10:10:33AM +0200, Jan Kara wrote:
On Thu 28-07-16 08:19:49, Dave Chinner wrote:
[..]
So DAX doesn't need flushing to maintain consistent view of the data but it
does need flushing to make sure fsync(2) results in data written via mmap
to reach persistent storage.
I thought this all changed with the removal of the pcommit
instruction and wmb_pmem() going away.  Isn't it now a platform
requirement that dirty cache lines over persistent memory ranges
are either guaranteed to be flushed to persistent storage on power
fail or when required by REQ_FLUSH?
No, nothing automates cache flushing.  The path of a write is:

cpu-cache -> cpu-write-buffer -> bus -> imc -> imc-write-buffer -> media

The ADR mechanism and the wpq-flush facility flush data through the
imc (integrated memory controller) to media.  dax_do_io() gets writes
to the imc, but we still need a posted-write-buffer flush mechanism to
guarantee data makes it out to media.

https://lkml.org/lkml/2016/7/9/131

And part of that is the wmb_pmem() calls are going away?

https://lkml.org/lkml/2016/7/9/136
https://lkml.org/lkml/2016/7/9/140

i.e. fsync on pmem only needs to take care of writing filesystem
metadata now, and the pmem driver handles the rest when it gets a
REQ_FLUSH bio from fsync?

https://lkml.org/lkml/2016/7/9/134

Or have we somehow ended up with the fucked up situation where
dax_do_io() writes are (effectively) immediately persistent and
untracked by internal infrastructure, whilst mmap() writes
require internal dirty tracking and fsync() to flush caches via
writeback?
dax_do_io() writes are not immediately persistent.  They bypass the
cpu-cache and cpu-write-buffer and are ready to be flushed to media
by REQ_FLUSH or power-fail on an ADR system.
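
For the mmap() side of the contrast raised above, here is a minimal userspace sketch of the pattern under discussion: plain stores through a shared mapping, followed by an explicit msync(2) and fsync(2). The helper name mmap_write_and_sync and its behaviour of truncating the file to the message length are assumptions made for illustration; nothing here comes from the patches linked in the thread. It runs the same way on an ordinary filesystem. What the kernel must do underneath for a DAX mapping is the question the thread is debating.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): store `len` bytes of `msg`
 * at offset 0 of `path` through a MAP_SHARED mapping, then ask the
 * kernel to make them durable. The file is truncated to `len` bytes.
 * Returns 0 on success, -1 on any failure.
 */
int mmap_write_and_sync(const char *path, const char *msg, size_t len)
{
    assert(path != NULL && msg != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by file blocks. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Ordinary CPU stores: the data may still sit in the cpu-cache. */
    memcpy(map, msg, len);

    int rc = 0;

    /* Ask for the mapped range to be written back synchronously. */
    if (msync(map, len, MS_SYNC) != 0)
        rc = -1;

    /* fsync(2) additionally covers the metadata needed to find the data. */
    if (fsync(fd) != 0)
        rc = -1;

    if (munmap(map, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;

    return rc;
}
```

A caller would use it as `mmap_write_and_sync("/mnt/pmem/file", "hello", 5)`, where the path is a placeholder. Either msync(MS_SYNC) or fsync() alone is normally enough for the data; both are shown only to mark the two entry points an application might use.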