Re: [PATCH v3 4/4] io_uring: add support for zone-append
From: Jens Axboe <axboe@kernel.dk>
Date: 2020-07-05 21:12:59
Also in:
io-uring, linux-fsdevel, lkml
On 7/5/20 3:09 PM, Matthew Wilcox wrote:
> On Sun, Jul 05, 2020 at 03:00:47PM -0600, Jens Axboe wrote:
>> On 7/5/20 12:47 PM, Kanchan Joshi wrote:
>>> From: Selvakumar S <redacted>
>>>
>>> For zone-append, block-layer will return zone-relative offset via ret2
>>> of ki_complete interface. Make changes to collect it, and send to
>>> user-space using cqe->flags.
>>>
>>> Signed-off-by: Selvakumar S <redacted>
>>> Signed-off-by: Kanchan Joshi <redacted>
>>> Signed-off-by: Nitesh Shetty <redacted>
>>> Signed-off-by: Javier Gonzalez <redacted>
>>> ---
>>>  fs/io_uring.c | 21 +++++++++++++++++++--
>>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 155f3d8..cbde4df 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -402,6 +402,8 @@ struct io_rw {
>>>  	struct kiocb			kiocb;
>>>  	u64				addr;
>>>  	u64				len;
>>> +	/* zone-relative offset for append, in sectors */
>>> +	u32				append_offset;
>>>  };
>>
>> I don't like this very much at all. As it stands, the first cacheline
>> of io_kiocb is set aside for request-private data. io_rw is already
>> exactly 64 bytes, which means that you're now growing io_rw beyond
>> a cacheline and increasing the size of io_kiocb as a whole.
>>
>> Maybe you can reuse io_rw->len for this, as that is only used on the
>> submission side of things.
>
> I'm surprised you aren't more upset by the abuse of cqe->flags for the
> address.
Yeah, it's not great either, but we have less leeway there in terms of how much space is available to pass back extra data.
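[Editor's sketch] To make the space constraint concrete, here is a minimal illustration of the 16-byte completion entry: a 64-bit user_data, a 32-bit result, and a 32-bit flags word. The struct name cqe_sketch and the helper flags_offset_range_bytes() are hypothetical and exist only for this example; the real definition is struct io_uring_cqe in include/uapi/linux/io_uring.h. The helper assumes 512-byte sectors, which the patch comment implies but does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mirror of the completion entry layout (hypothetical name). */
struct cqe_sketch {
	uint64_t user_data;	/* passed through from the submission */
	int32_t  res;		/* result code for the request */
	uint32_t flags;		/* the only remaining field */
};

/* The entry has no spare room: 8 + 4 + 4 bytes. */
static_assert(sizeof(struct cqe_sketch) == 16, "CQE sketch must be 16 bytes");

/*
 * Hypothetical helper: how many bytes of zone-relative range a 32-bit
 * field can address if it carries an offset in 512-byte sectors
 * (sector size is an assumption of this sketch).
 */
static uint64_t flags_offset_range_bytes(void)
{
	return ((uint64_t)1 << 32) * 512;
}
```

Under those assumptions the flags word covers 2 TiB of zone-relative range, which is why it is usable for this purpose at all, but there is nothing left over for any other data.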
> What do you think to my idea of interpreting the user_data as being a
> pointer to somewhere to store the address? Obviously other things
> can be stored after the address in the user_data.
I don't like that at all, as all other commands just pass user_data through. This means the application would have to treat this very differently, and potentially not have a way to store any data for locating the original command on the user side.
> Or we could have a separate flag to indicate that is how to interpret
> the user_data.
I'd be vehemently against changing user_data in any shape or form. It's
to be passed through from sqe to cqe, that's how the command flow works.
It's never kernel generated, and it's also used as a key for command
lookup.

-- 
Jens Axboe
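[Editor's sketch] The pass-through contract described above can be illustrated in plain C without touching the kernel. Everything here is hypothetical and for illustration only: sqe_sketch and cqe_sketch stand in for the real submission and completion entries, complete_sketch() stands in for the kernel side, and app_request is an assumed application-side structure. The point is only that the value placed in user_data at submission comes back unchanged, so the application can use it as the key to find its own request.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed application-side bookkeeping for one in-flight request. */
struct app_request {
	int     id;
	int     done;
	int32_t result;
};

/* Reduced stand-ins for the submission and completion entries. */
struct sqe_sketch { uint64_t user_data; };
struct cqe_sketch { uint64_t user_data; int32_t res; uint32_t flags; };

/* Stand-in for the kernel: user_data is copied verbatim, never generated. */
static struct cqe_sketch complete_sketch(const struct sqe_sketch *sqe,
					 int32_t res)
{
	struct cqe_sketch cqe = {
		.user_data = sqe->user_data,
		.res = res,
		.flags = 0,
	};
	return cqe;
}

/* Application side: user_data is the lookup key back to the request. */
static struct app_request *handle_completion(const struct cqe_sketch *cqe)
{
	struct app_request *req =
		(struct app_request *)(uintptr_t)cqe->user_data;

	req->done = 1;
	req->result = cqe->res;
	return req;
}
```

If the kernel were to reinterpret user_data as a pointer to an output location, the lookup in handle_completion() would no longer have a key to work with for that command, which is the objection raised in the reply.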