Thread (8 messages) 8 messages, 4 authors, 2016-05-18

Re: [PATCH] block: Fix S_DAX inode flag locking

From: Dave Chinner <david@fromorbit.com>
Date: 2016-05-18 14:32:12
Also in: linux-fsdevel

On Wed, May 18, 2016 at 10:20:39AM +0200, Jan Kara wrote:
On Wed 18-05-16 08:58:42, Dave Chinner wrote:
quoted
On Tue, May 17, 2016 at 12:34:57PM -0700, Dan Williams wrote:
quoted
On Tue, May 17, 2016 at 11:29 AM, Jon Derrick
[off-list ref] wrote:
quoted
This patch fixes S_DAX bd_inode i_flag locking to conform to suggested
A "fix" implies that its currently broken.  I don't see how it is, not
until we add an ioctl method or other path that also tries to update
the flags outside of blkdev_get() context.  So, I don't think this
patch stands on its own if you were intending it to be merged
separately.
quoted
locking rules. It is presumed that S_DAX is the only valid inode flag
for a block device which subscribes to direct-access, and will restore
any previously set flags if direct-access initialization fails.

This reverts to i_flags behavior prior to
bbab37ddc20bae4709bca8745c128c4f46fe63c5
by allowing other bd_inode flags when DAX is disabled

Signed-off-by: Jon Derrick <redacted>
---
 fs/block_dev.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 20a2c02..d41e37f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1159,6 +1159,20 @@ void bd_set_size(struct block_device *bdev, loff_t size)
 }
 EXPORT_SYMBOL(bd_set_size);

+static void bd_add_dax(struct inode *inode)
+{
+       inode_lock(inode);
+       inode->i_flags |= S_DAX;
+       inode_unlock(inode);
+}
+
+static void bd_clear_dax(struct inode *inode)
+{
+       inode_lock(inode);
+       inode->i_flags &= ~S_DAX;
+       inode_unlock(inode);
+}
Since this is inode generic should these helpers be prefixed "i_"
rather than "bd_"?
Probably not, because in general filesystems are responsible for
updating i_flags to reflect on-disk inode configuration and that's
typically done under transaction contexts. e.g.  through ioctl
interfaces to set/clear flags that are stored on disk.  As such,
inode->i_flags is effectively protected by the filesystem specific
locking heirarchy, not the generic inode_lock().

e.g. have a look at XFS storing a persistent "DAX-enabled" flag in
the inode, which can be set/cleared on individual inodes dynamically
by FS_IOC_FSSETXATTR. The XFS i_flags update function assumes
exclusive access to the field as it is called under locked
transaction context. Similar code exists in ext4, btrfs, gfs2,
etc....
So in case of ext4, we actually do use inode_lock() to protect against
racing IOC_SETFLAGS calls. i_flags is a strange mix and there are a few
(like S_DEAD or S_NOSEC) which are not persistent and those get set /
cleared by VFS. Some of those places (e.g. the clearing in
__vfs_setxattr_noperm()) are not really controlled by the filesystem
AFAICT. So when XFS doesn't use inode_lock() to protect i_flags updates it
can race with VFS on i_flags updates.
There are several other filesystems with the same problem. It's not
actually clear what they rules for i_flags are. The comment above
inode_set_flags() is ambiguous at best, and even points this out:

 * In the long run, i_mutex is overkill, and we should probably look
 * at using the i_lock spinlock to protect i_flags, and then make sure
 * it is so documented in include/linux/fs.h and that all code follows
 * the locking convention!!

Perhaps that should be done before the situation gets worse...

Cheers

Dave.
-- 
Dave Chinner
david@fromorbit.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help