Re: [PATCH] ext4: optimize ext4 direct I/O locking for reading
From: Jan Kara <jack@suse.cz>
Date: 2016-09-22 12:32:36
Also in:
linux-fsdevel
On Wed 21-09-16 10:37:48, Ted Tso wrote:
On Wed, Sep 21, 2016 at 06:26:09AM -0700, Christoph Hellwig wrote:quoted
Is there any chance you could look into simplifying the locking instead of making it even more complicated? Since Al replaced i_mutex with i_rwsem you can now easily take that in shared mode. E.g. if you'd move to a direct I/O model closer to XFS, ocfs2 and NFS where you always take i_rwsem in shared mode you'll get the scalibility of the dioread_nolock case while no having to do these crazy dances, and you can also massively simplify the codebase. Similarly you can demote it from exclusive to shared after allocating blocks in the write path, and you'll end up with something way easier to understand.Unfortunately, in order to do this we need to extend the dioread_nolock handling for sub-page block sizes. (This is where we insert the allocated blocks into the extent maps marked uninitialized, and only converting the extent from uninitialized to initialized --- which today only works when the page size == block size.) This is on my todo list, but half of the problem is the mess caused by needing to iterate over the circularly linked buffer heads when there are multiple buffer heads covering the page. I was originally assuming that it would be easier to fix this after doing the bh -> iomap conversion, but it's in a while before I looked into this particular change. I can try to take a closer look again.... The main reason why I looked into this hack --- and I will be the first to agree it was a hack, is because I had a request to support the dioread_nolock scalability on a Little Endian PowerPC system which has 64k page sizes.quoted
Sorry for the rant, but I just had to dig into this code when looking at converting ext4 to the new DAX path, and my eyes still bleed..Yeah, I know, and I'm sorry. There's quite a bit of technical debt there, which I do want to clean up.
So I think what Christoph meant in this case is something like attached patch. That achieves more than your dirty hack in a much cleaner way. Beware, the patch is only compile-tested. Then there is the case of unlocked direct IO overwrites which we allow to run without inode_lock in dioread_nolock mode as well and that is more difficult to resolve (there lay the problems with blocksize < pagesize you speak about). Honza -- Jan Kara [off-list ref] SUSE Labs, CR
Attachments
- 0001-ext4-Allow-parallel-DIO-reads.patch [text/x-patch] 2655 bytes · preview