Thread (18 messages), 7 authors, 2017-08-21

Re: [RFC PATCH 0/7] dax, ext4: Synchronous page faults

From: Jan Kara <jack@suse.cz>
Date: 2017-08-17 16:16:32
Also in: linux-fsdevel, linux-xfs, nvdimm

On Mon 14-08-17 17:04:17, Boaz Harrosh wrote:
Thank you Jan, I've been patiently waiting for this MAP_SYNC flag since I
asked for it in 2014. I'm so glad its time is finally due.

Thank you for working on this. Please CC me on future patches.
(note the new Netapp email)

On 13/08/17 12:25, Christoph Hellwig wrote:
On Sat, Aug 12, 2017 at 07:44:14PM -0700, Dan Williams wrote:
How about MAP_SYNC == (MAP_SHARED|MAP_PRIVATE)? On older kernels that
should get -EINVAL, and on new kernels it means SYNC+SHARED.
Cute trick, but I'd hate to waste it just for our little flag.

How about:

#define __MAP_VALIDATE		(MAP_SHARED|MAP_PRIVATE)
#define MAP_SYNC		(0x??? | __MAP_VALIDATE)

so that we can reuse that trick for any new flag?
YES! And please create a mask for all new flags, and in the validation
code, if ((m_flags & __MAP_VALIDATE) == __MAP_VALIDATE), then also check
that (m_flags & __MAP_NEWFLAGS) is not empty. This way you actually
preserve the old check that SHARED and PRIVATE do not coexist.
For now I did just a crude hack. Dan is working on a new mmap syscall that
checks flags, which will be cleaner...
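
The mask-and-validate idea described above could be sketched in C roughly as
follows. This is only an illustration and not the code from the patch set:
the DEMO_ names, the placeholder bit value, and the helper
demo_check_map_flags() are all hypothetical, and the real MAP_SYNC value is
left open in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>   /* MAP_SHARED, MAP_PRIVATE */

/* Both sharing bits at once: the combination older kernels reject. */
#define DEMO_MAP_VALIDATE  (MAP_SHARED | MAP_PRIVATE)

/* Placeholder bit, chosen only for this sketch (not the real flag value). */
#define DEMO_SYNC_BIT      0x100000UL
#define DEMO_MAP_SYNC      (DEMO_SYNC_BIT | DEMO_MAP_VALIDATE)

/* Mask of every new flag that rides on top of DEMO_MAP_VALIDATE. */
#define DEMO_MAP_NEWFLAGS  (DEMO_SYNC_BIT)

/* Returns 0 if the flag combination is acceptable, -EINVAL otherwise. */
static int demo_check_map_flags(unsigned long flags)
{
    unsigned long sharing = flags & DEMO_MAP_VALIDATE;

    /* New flag bits must not overlap the sharing bits. */
    assert((DEMO_MAP_NEWFLAGS & DEMO_MAP_VALIDATE) == 0);

    if (sharing == 0)
        return -EINVAL;         /* neither SHARED nor PRIVATE */

    if (sharing == DEMO_MAP_VALIDATE) {
        /*
         * SHARED|PRIVATE is only legal as the carrier for a new flag.
         * Without one, keep the old "cannot coexist" rejection.
         */
        if ((flags & DEMO_MAP_NEWFLAGS) == 0)
            return -EINVAL;
    }
    return 0;
}
```

With these definitions, plain MAP_SHARED|MAP_PRIVATE is still rejected, while
DEMO_MAP_SYNC passes because the new-flags mask is not empty.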
A few comments on this new MAP_ flag:

0] The name at least needs to be MAP_MSYNC, because only meta-data is
    synced, not the data pointed to. That is the responsibility of the app.
We actually do a normal fdatasync() call, so we do flush data as well. This
way we don't have to be afraid of stale data exposure or other strange
effects. So I've kept the name MAP_SYNC.
1] You have named this flag MAP_SYNC, but it is very much related to
   dax and the ability of user mode to "flush" the data pointed to by this
   now "synced" meta-data.
   For example in ext4, this flag set on an inode that is *not* IS_DAX
   should fail the mmap, because there is no point in synced meta-data if
   the data is actually in the page cache and we know for sure it was not
   yet synced, and there is no way for user mode to directly "sync" the
   data either.
Yes, done.
2] The code should be constructed so that the default check for MAP_SYNC
   fails, and only opted-in FSs are allowed.
   (So as not to modify all implementations of file_operations->mmap() )
Agreed, but for now I've skipped this while I wait for the new mmap syscall
and for how Dan implements flag checking there.
3] /dev/pmem could start serving DAX pages in mmap, if asked for MAP_MSYNC
   (which is also an API that says "I know I need to clflush". See 1] )
MAP_SYNC is rather more like: I can also use clflush instead of
fdatasync(2). And this is rather important as all legacy applications are
100% safe in the new scheme.
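
From the application side, the "clflush instead of fdatasync(2)" distinction
might look roughly like the sketch below. It is an illustration under several
assumptions rather than the interface defined by this RFC: it only tries a
synchronous mapping when the libc headers expose MAP_SYNC and
MAP_SHARED_VALIDATE, it assumes a 64-byte cache line, and the demo_ names are
hypothetical. Whenever the synchronous mapping is unavailable it falls back
to a plain MAP_SHARED mapping plus fdatasync().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>
#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush, _mm_sfence */
#endif

#define DEMO_CACHELINE 64       /* assumed cache line size */

struct demo_map {
    void  *addr;
    size_t len;
    int    fd;
    bool   sync;                /* true if the kernel accepted MAP_SYNC */
};

/* Open (or create) path, size it to len bytes, and map it shared. */
static int demo_map_file(struct demo_map *m, const char *path, size_t len)
{
    assert(len > 0);
    m->len = len;
    m->sync = false;
    m->addr = MAP_FAILED;
    m->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (m->fd < 0)
        return -1;
    if (ftruncate(m->fd, (off_t)len) != 0) {
        close(m->fd);
        return -1;
    }
#if defined(MAP_SYNC) && defined(MAP_SHARED_VALIDATE)
    /* Fails on kernels or files that cannot provide a synchronous mapping. */
    m->addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, m->fd, 0);
    if (m->addr != MAP_FAILED)
        m->sync = true;
#endif
    if (m->addr == MAP_FAILED)  /* legacy path, always safe */
        m->addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, m->fd, 0);
    if (m->addr == MAP_FAILED) {
        close(m->fd);
        return -1;
    }
    return 0;
}

/* Make n bytes starting at offset off durable. Returns 0 on success. */
static int demo_persist(const struct demo_map *m, size_t off, size_t n)
{
    assert(off <= m->len && n <= m->len - off);
#if defined(__SSE2__)
    if (m->sync) {
        /* Synchronous mapping: flushing CPU cache lines is enough. */
        uintptr_t p = ((uintptr_t)m->addr + off) & ~(uintptr_t)(DEMO_CACHELINE - 1);
        uintptr_t end = (uintptr_t)m->addr + off + n;
        for (; p < end; p += DEMO_CACHELINE)
            _mm_clflush((const void *)p);
        _mm_sfence();           /* conservative ordering fence */
        return 0;
    }
#endif
    /* Ordinary mapping (or no clflush available): ask the kernel. */
    return fdatasync(m->fd);
}
```

An application that never passes the new flag simply stays on the
fdatasync() branch, which is why legacy code is unaffected.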
 
4] Once we have this flag, properly implemented in at least one FS
   and optionally in /dev/pmemX, we no longer have any justification for
   /dev/daxX and it can die a slow and happy death.
This will be more complex I guess - see MAP_DIRECT proposal...

								Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR