Re: [PATCH 5/5] aio: add support for file based polled IO
From: Jens Axboe <axboe@kernel.dk>
Date: 2018-11-19 13:20:59
Also in:
linux-fsdevel
On 11/18/18 6:57 PM, Dave Chinner wrote:
On Sat, Nov 17, 2018 at 04:53:17PM -0700, Jens Axboe wrote:quoted
Needs further work, but this should work fine on normal setups with a file system on a pollable block device.What should work fine? I've got no idea what this patch is actually providing....
Sorry, should have made it more clear that this one is just a test patch to enable polled aio for files, it's not supposed to be considered complete.
quoted
Signed-off-by: Jens Axboe <axboe@kernel.dk> --- fs/aio.c | 2 ++ fs/direct-io.c | 4 +++- fs/iomap.c | 7 +++++-- 3 files changed, 10 insertions(+), 3 deletions(-)diff --git a/fs/aio.c b/fs/aio.c index 500da3ffc376..e02085fe10d7 100644 --- a/fs/aio.c +++ b/fs/aio.c@@ -1310,6 +1310,8 @@ static struct block_device *aio_bdev_host(struct kiocb *req) if (S_ISBLK(inode->i_mode)) return I_BDEV(inode); + else if (inode->i_sb && inode->i_sb->s_bdev) + return inode->i_sb->s_bdev;XFS might be doing AIO to files on real-time device, not inode->i_sb->s_bdev. So this may well be the wrong block device for the IO being submitted.
See patch description, XFS doing AIO on a real-time device is not a "normal setup". It doesn't work for btrfs either, for instance. As far as I can tell, there are a few routes that we can go here: 1) Have an fs ops that returns the bdev. 2) Set it after submit time, like we do in other spots. This won't work if we straddle devices. I don't feel that strongly about it, and frankly I'm fine with just dropping non-bdev support for now. The patch is provided for testing for the 99% of use cases, which is a file system on top of a single device.
quoted
diff --git a/fs/iomap.c b/fs/iomap.c index 74c1f37f0fd6..4cf412b6230a 100644 --- a/fs/iomap.c +++ b/fs/iomap.c@@ -1555,6 +1555,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, struct page *page = ZERO_PAGE(0); int flags = REQ_SYNC | REQ_IDLE; struct bio *bio; + blk_qc_t qc; bio = bio_alloc(GFP_KERNEL, 1); bio_set_dev(bio, iomap->bdev);@@ -1570,7 +1571,9 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, bio_set_op_attrs(bio, REQ_OP_WRITE, flags); atomic_inc(&dio->ref); - return submit_bio(bio); + qc = submit_bio(bio); + WRITE_ONCE(dio->iocb->ki_blk_qc, qc); + return qc; }Why is this added to sub-block zeroing IO calls? It gets overwritten by the data IO submission, so this value is going to change as the IO progresses. What does making these partial IOs visible provide, especially as they then get overwritten by the next submissions? Indeed, how does one wait on all IOs in the DIO to complete if we are only tracking one of many?
That does look like a mistake, the zeroing should just be ignored. We only care about the actual IO. -- Jens Axboe