Re: [PATCH 3/3] xfs: introduce per-inode DAX enablement
From: Dan Williams <hidden>
Date: 2016-01-21 22:53:08
Also in:
linux-fsdevel
On Thu, Jan 21, 2016 at 1:58 PM, Dave Chinner [off-list ref] wrote:
On Thu, Jan 21, 2016 at 08:37:11AM -0800, Dan Williams wrote:
On Sun, Jan 3, 2016 at 9:54 PM, Dave Chinner [off-list ref] wrote:
From: Dave Chinner <redacted>
Rather than just being able to turn DAX on and off via a mount option, some applications may only want to enable DAX for certain performance critical files in a filesystem. This patch introduces a new inode flag to enable DAX in the v3 inode di_flags2 field. It adds support for setting and clearing flags in the di_flags2 field via the XFS_IOC_FSSETXATTR ioctl, and sets the S_DAX inode flag appropriately when it is seen.
When this flag is set on a directory, it acts as an "inherit flag". That is, inodes created in the directory will automatically inherit the on-disk inode DAX flag, enabling administrators to set up directory hierarchies that automatically use DAX. Setting this flag on an empty root directory will make the entire filesystem use DAX by default.
When switching from page-cache to DAX, don't we need to flush existing page cache mappings and remap directly? Or, is the thought that userspace needs to comprehend the presence of mixed mappings after changing S_DAX?
The change should be transparent to userspace. In general, I don't expect users to change the behaviour of files that are in active use (why would you do that?).
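For readers who want to see what setting the flag from userspace might look like, here is a minimal Python sketch of the get/modify/set sequence the patch description implies. It is not taken from the patch. The constants are assumptions: they are the values I believe the generic FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR interface and FS_XFLAG_DAX carry in later <linux/fs.h> headers, with ioctl numbers as encoded on x86 and arm. The helper names toggle_dax and set_dax are hypothetical.

```python
import fcntl
import os
import struct

# Assumed layout of struct fsxattr: five __u32 fields followed by 8 pad bytes.
FSXATTR = struct.Struct("=5I8x")

# Assumed ioctl numbers (x86/arm encoding; other architectures may differ).
FS_IOC_FSGETXATTR = 0x801C581F  # _IOR('X', 31, struct fsxattr)
FS_IOC_FSSETXATTR = 0x401C5820  # _IOW('X', 32, struct fsxattr)
FS_XFLAG_DAX = 0x00008000       # assumed value of the per-inode DAX flag


def toggle_dax(raw: bytes, enable: bool) -> bytes:
    """Return a copy of a packed fsxattr with the DAX bit set or cleared."""
    xflags, extsize, nextents, projid, cowextsize = FSXATTR.unpack(raw)
    if enable:
        xflags |= FS_XFLAG_DAX
    else:
        xflags &= ~FS_XFLAG_DAX
    return FSXATTR.pack(xflags, extsize, nextents, projid, cowextsize)


def set_dax(path: str, enable: bool = True) -> None:
    """Read the current flags, flip only the DAX bit, and write them back."""
    fd = os.open(path, os.O_RDONLY)
    try:
        raw = fcntl.ioctl(fd, FS_IOC_FSGETXATTR, bytes(FSXATTR.size))
        fcntl.ioctl(fd, FS_IOC_FSSETXATTR, toggle_dax(raw, enable))
    finally:
        os.close(fd)
```

Under these assumptions, calling set_dax on an empty directory would be the way to request the inherit behaviour described above. The read-modify-write keeps every other flag as the filesystem reported it.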
If by accident someone tries to dynamically change S_DAX while existing mappings are established I think the kernel should just return EBUSY. I was not proposing we support it as a first-class operation.
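To make the EBUSY suggestion concrete from the caller's side, here is a small Python sketch of how a tool could treat that error. The EBUSY behaviour is only a proposal in this thread, not something the patch implements, and try_change is a hypothetical helper that wraps whatever call actually changes the flag.

```python
import errno
from typing import Callable


def try_change(apply_change: Callable[[], None]) -> str:
    """Run a flag change; report 'busy' if the kernel refuses with EBUSY."""
    try:
        apply_change()
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            # Proposed semantics: mappings exist, so the switch is refused.
            return "busy"
        raise  # any other failure is a real error for the caller
    return "changed"
```

A caller would then close or unmap the file and retry rather than expect the kernel to convert live mappings.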
This patch is really just introducing the flag, the userspace API and making it propagate correctly via the on-disk format. We'll fix up whatever problems with switching it on/off dynamically as we go, like we do with most experimental features once the on-disk behaviour is sorted out.
Ok.
i.e. I've already got a couple of fixes we need to add to this - the DAX flag is only valid on CRC enabled filesystems,
I assume for torn-write protection? The CRC limitation makes sense, but we theoretically could get the same effect by using a separate logdev that does not tear writes, right?
so we need to check that in the ioctl (general problem with using di_flags2 field, not DAX flag specific issue). Adding code to sync and unmap when changing the flag is probably also necessary in the ioctl - I don't have code to do that yet, but I have been thinking about it...
Matthew and I have also talked about a modification of mincore(2) to interrogate the effective mapping mode. It seems we'll need that or something like it given the growing list of caveats with setting up a DAX mapping.