Thread: 61 messages, 8 authors, 2020-07-21

Re: [PATCH v3 4/4] io_uring: add support for zone-append

From: Jens Axboe <axboe@kernel.dk>
Date: 2020-07-05 21:12:59
Also in: io-uring, linux-fsdevel, lkml

On 7/5/20 3:09 PM, Matthew Wilcox wrote:
On Sun, Jul 05, 2020 at 03:00:47PM -0600, Jens Axboe wrote:
On 7/5/20 12:47 PM, Kanchan Joshi wrote:
From: Selvakumar S <redacted>

For zone-append, block-layer will return zone-relative offset via ret2
of ki_complete interface. Make changes to collect it, and send to
user-space using cqe->flags.

Signed-off-by: Selvakumar S <redacted>
Signed-off-by: Kanchan Joshi <redacted>
Signed-off-by: Nitesh Shetty <redacted>
Signed-off-by: Javier Gonzalez <redacted>
---
 fs/io_uring.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 155f3d8..cbde4df 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -402,6 +402,8 @@ struct io_rw {
 	struct kiocb			kiocb;
 	u64				addr;
 	u64				len;
+	/* zone-relative offset for append, in sectors */
+	u32			append_offset;
 };
I don't like this very much at all. As it stands, the first cacheline
of io_kiocb is set aside for request-private data. io_rw is already
exactly 64 bytes, which means that you're now growing io_rw beyond
a cacheline and increasing the size of io_kiocb as a whole.

Maybe you can reuse io_rw->len for this, as that is only used on the
submission side of things.
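
[Editorial sketch, not part of the original message.] The size argument above can be checked in userspace with stand-in types. The 48-byte `fake_kiocb` is an assumption chosen so that the unpatched struct comes out at 64 bytes, matching the figure quoted in the review; it is not the kernel's real `struct kiocb`. The `store_append_offset`/`load_append_offset` helpers are hypothetical names illustrating the suggestion to reuse `len` once submission is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64

/* Hypothetical stand-in for struct kiocb: 48 opaque bytes (assumption). */
struct fake_kiocb {
	uint64_t opaque[6];
};

/* io_rw as described in the review: exactly one cacheline. */
struct io_rw_before {
	struct fake_kiocb kiocb;
	uint64_t addr;
	uint64_t len;
};

/* io_rw with the extra field added by the patch. */
struct io_rw_patched {
	struct fake_kiocb kiocb;
	uint64_t addr;
	uint64_t len;
	uint32_t append_offset;	/* zone-relative offset, in sectors */
};

/* The unpatched layout fills the cacheline; the patched one spills over. */
static_assert(sizeof(struct io_rw_before) == CACHELINE_BYTES,
	      "stand-in io_rw should be exactly one cacheline");
static_assert(sizeof(struct io_rw_patched) > CACHELINE_BYTES,
	      "extra field pushes io_rw past one cacheline");

/*
 * Alternative from the review: once submission is over, len is no longer
 * needed, so the completion side can overwrite it with the offset.
 */
static inline void store_append_offset(struct io_rw_before *rw, uint32_t sectors)
{
	rw->len = sectors;
}

static inline uint32_t load_append_offset(const struct io_rw_before *rw)
{
	return (uint32_t)rw->len;
}
```

The static assertions fail at compile time if the stand-in layout ever stops matching the sizes discussed here, which is the same kind of guard one would want around the real request struct.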
I'm surprised you aren't more upset by the abuse of cqe->flags for the
address.
Yeah, it's not great either, but we have less leeway there in terms of
how much space is available to pass back extra data.
What do you think to my idea of interpreting the user_data as being a
pointer to somewhere to store the address?  Obviously other things
can be stored after the address in the user_data.
I don't like that at all, as all other commands just pass user_data
through. This means the application would have to treat this very
differently, and potentially not have a way to store any data for
locating the original command on the user side.
Or we could have a separate flag to indicate that is how to interpret
the user_data.
I'd be vehemently against changing user_data in any shape or form.
It's to be passed through from sqe to cqe, that's how the command flow
works. It's never kernel generated, and it's also used as a key for
command lookup.

-- 
Jens Axboe
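
[Editorial sketch, not part of the original message.] The pass-through rule for user_data can be illustrated with a small userspace model. `demo_sqe`, `demo_cqe`, `app_request` and the `demo_*` functions are all hypothetical and only mimic the shape of the real ring entries; no io_uring API is called. The application stores a pointer to its own request in user_data at submission and uses the unchanged value at completion to find that request again.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the submission/completion entries. */
struct demo_sqe {
	uint64_t user_data;	/* opaque to the "kernel" side */
	uint64_t off;
};

struct demo_cqe {
	uint64_t user_data;	/* copied verbatim from the sqe */
	int32_t  res;
	uint32_t flags;
};

/* Application-side bookkeeping for one in-flight command. */
struct app_request {
	int     id;
	int     done;
	int32_t result;
};

/* Submission: the application picks user_data; here, a pointer to its request. */
static inline void demo_prepare(struct demo_sqe *sqe, struct app_request *req,
				uint64_t off)
{
	sqe->user_data = (uint64_t)(uintptr_t)req;
	sqe->off = off;
}

/* Model of completion: user_data is passed through, never generated or altered. */
static inline struct demo_cqe demo_complete(const struct demo_sqe *sqe, int32_t res)
{
	struct demo_cqe cqe = { .user_data = sqe->user_data, .res = res, .flags = 0 };
	return cqe;
}

/* Reaping: user_data is the key that locates the original command. */
static inline struct app_request *demo_reap(const struct demo_cqe *cqe)
{
	struct app_request *req = (struct app_request *)(uintptr_t)cqe->user_data;

	req->done = 1;
	req->result = cqe->res;
	return req;
}
```

If the completion path reinterpreted user_data as a pointer to write the append address into, `demo_reap` in this model would have no untouched key left for finding the request, which is the objection raised in the message.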