Thread (17 messages) 17 messages, 3 authors, 2018-11-20

Re: [PATCH 5/5] aio: add support for file based polled IO

From: Jens Axboe <axboe@kernel.dk>
Date: 2018-11-19 13:20:59
Also in: linux-fsdevel

On 11/18/18 6:57 PM, Dave Chinner wrote:
On Sat, Nov 17, 2018 at 04:53:17PM -0700, Jens Axboe wrote:
quoted
Needs further work, but this should work fine on normal setups
with a file system on a pollable block device.
What should work fine? I've got no idea what this patch is actually
providing....
Sorry, should have made it more clear that this one is just a test
patch to enable polled aio for files, it's not supposed to be
considered complete.
quoted
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c       | 2 ++
 fs/direct-io.c | 4 +++-
 fs/iomap.c     | 7 +++++--
 3 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 500da3ffc376..e02085fe10d7 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1310,6 +1310,8 @@ static struct block_device *aio_bdev_host(struct kiocb *req)
 
 	if (S_ISBLK(inode->i_mode))
 		return I_BDEV(inode);
+	else if (inode->i_sb && inode->i_sb->s_bdev)
+		return inode->i_sb->s_bdev;
XFS might be doing AIO to files on real-time device, not
inode->i_sb->s_bdev. So this may well be the wrong block device
for the IO being submitted.
See patch description, XFS doing AIO on a real-time device is not a
"normal setup".

It doesn't work for btrfs either, for instance. As far as I can tell, there
are a few routes that we can go here:

1) Have an fs ops that returns the bdev.
2) Set it after submit time, like we do in other spots. This won't work
   if we straddle devices.

I don't feel that strongly about it, and frankly I'm fine with just
dropping non-bdev support for now. The patch is provided for testing
for the 99% of use cases, which is a file system on top of a single
device.
quoted
diff --git a/fs/iomap.c b/fs/iomap.c
index 74c1f37f0fd6..4cf412b6230a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1555,6 +1555,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	struct page *page = ZERO_PAGE(0);
 	int flags = REQ_SYNC | REQ_IDLE;
 	struct bio *bio;
+	blk_qc_t qc;
 
 	bio = bio_alloc(GFP_KERNEL, 1);
 	bio_set_dev(bio, iomap->bdev);
@@ -1570,7 +1571,9 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
 
 	atomic_inc(&dio->ref);
-	return submit_bio(bio);
+	qc = submit_bio(bio);
+	WRITE_ONCE(dio->iocb->ki_blk_qc, qc);
+	return qc;
 }
Why is this added to sub-block zeroing IO calls? It gets overwritten
by the data IO submission, so this value is going to change as the
IO progresses. What does making these partial IOs visible provide,
especially as they then get overwritten by the next submissions?
Indeed, how does one wait on all IOs in the DIO to complete if we
are only tracking one of many?
That does look like a mistake, the zeroing should just be ignored.
We only care about the actual IO.

-- 
Jens Axboe
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help