Re: [PATCH v5 1/2] dax: Don't touch i_dio_count in dax_do_io()
From: Christoph Hellwig <hch@infradead.org>
Date: 2016-05-05 14:27:53
Also in:
lkml
On Thu, May 05, 2016 at 04:16:37PM +0200, Jan Kara wrote:
> We cannot easily do this currently - the reason is that in several places we wait for i_dio_count to drop to 0 (look for inode_dio_wait()) while holding i_mutex to wait for all outstanding DIO / DAX IO. You'd break this logic with this patch. If we indeed put all writes under i_mutex, this problem would go away but as Dave explains in his email, we consciously do as much IO as we can without i_mutex to allow reasonable scalability of multiple writers into the same file.

So the above should be fine for xfs, but you're telling me that ext4 is doing DAX I/O without any inode lock at all? In that case it's indeed not going to work.

> The downside of that is that overwrites and writes vs reads are not atomic wrt each other as POSIX requires. It has been that way for direct IO in XFS case for a long time, with DAX this non-conforming behavior is proliferating more. I agree that's not ideal but serializing all writes on a file is rather harsh for persistent memory as well...

For non-O_DIRECT I/O it's simply required.
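
For readers following the thread, here is a rough userspace sketch of the pattern under discussion. It is not kernel code and not part of the patch. Only i_mutex, i_dio_count and the idea of inode_dio_wait() come from the mails above; struct toy_inode, every toy_* function and the data_written flag are made up for illustration, and pthreads stands in for the kernel primitives. I/O runs without i_mutex but is counted, and a truncate-like operation takes i_mutex and then waits for the count to drain.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct inode. */
struct toy_inode {
    pthread_mutex_t i_mutex;     /* models the inode lock */
    pthread_mutex_t count_lock;  /* protects i_dio_count */
    pthread_cond_t  count_zero;  /* signalled when i_dio_count hits 0 */
    int i_dio_count;             /* number of I/Os in flight */
    int data_written;            /* set by the pretend I/O */
};

void toy_inode_init(struct toy_inode *inode)
{
    pthread_mutex_init(&inode->i_mutex, NULL);
    pthread_mutex_init(&inode->count_lock, NULL);
    pthread_cond_init(&inode->count_zero, NULL);
    inode->i_dio_count = 0;
    inode->data_written = 0;
}

/* Account for an I/O that will run without i_mutex held. */
void toy_dio_begin(struct toy_inode *inode)
{
    pthread_mutex_lock(&inode->count_lock);
    inode->i_dio_count++;
    pthread_mutex_unlock(&inode->count_lock);
}

/* Drop the count and wake waiters once nothing is in flight. */
void toy_dio_end(struct toy_inode *inode)
{
    pthread_mutex_lock(&inode->count_lock);
    assert(inode->i_dio_count > 0);
    if (--inode->i_dio_count == 0)
        pthread_cond_broadcast(&inode->count_zero);
    pthread_mutex_unlock(&inode->count_lock);
}

/* Loose analogue of inode_dio_wait(): block until the count is 0. */
void toy_dio_wait(struct toy_inode *inode)
{
    pthread_mutex_lock(&inode->count_lock);
    while (inode->i_dio_count > 0)
        pthread_cond_wait(&inode->count_zero, &inode->count_lock);
    pthread_mutex_unlock(&inode->count_lock);
}

/* Pretend I/O completion: does its work without taking i_mutex. */
void *toy_writer(void *arg)
{
    struct toy_inode *inode = arg;

    inode->data_written = 1;
    toy_dio_end(inode);
    return NULL;
}

/*
 * Truncate-like operation: takes i_mutex, then waits for in-flight I/O.
 * Returns the data_written flag as observed after the wait.
 */
int toy_truncate(struct toy_inode *inode)
{
    int seen;

    pthread_mutex_lock(&inode->i_mutex);
    toy_dio_wait(inode);
    seen = inode->data_written;
    pthread_mutex_unlock(&inode->i_mutex);
    return seen;
}
```

In this model, if the I/O path neither bumps the count nor takes i_mutex, toy_dio_wait() returns immediately and the truncate-like path can run while the I/O is still in flight. That is the breakage Jan describes for the case where I/O is done without any inode lock.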