Thread (31 messages) 31 messages, 6 authors, 2022-12-13

Re: [PATCH v5 02/10] block: Add copy offload support infrastructure

From: Nitesh Shetty <hidden>
Date: 2022-11-23 12:14:01
Also in: dm-devel, linux-fsdevel, linux-nvme, lkml

On Wed, Nov 23, 2022 at 04:04:18PM +0800, Ming Lei wrote:
On Wed, Nov 23, 2022 at 11:28:19AM +0530, Nitesh Shetty wrote:
quoted
Introduce blkdev_issue_copy which supports source and destination bdevs,
and an array of (source, destination and copy length) tuples.
Introduce REQ_COPY copy offload operation flag. Create a read-write
bio pair with a token as payload and submitted to the device in order.
Read request populates token with source specific information which
is then passed with write request.
This design is courtesy Mikulas Patocka's token based copy
I thought this patchset is just for enabling copy command which is
supported by hardware. But turns out it isn't, because blk_copy_offload()
still submits read/write bios for doing the copy.

I am just wondering why not let copy_file_range() cover this kind of copy,
and the framework has been there.
Main goal was to enable copy command, but community suggested to add
copy emulation as well.

blk_copy_offload - actually issues copy command in driver layer.
The way read/write BIOs are percieved is different for copy offload.
In copy offload we check REQ_COPY flag in NVMe driver layer to issue
copy command. But we did missed it to add in other driver's, where they
might be treated as normal READ/WRITE.

blk_copy_emulate - is used if we fail or if device doesn't support native
copy offload command. Here we do READ/WRITE. Using copy_file_range for
emulation might be possible, but we see 2 issues here.
1. We explored possibility of pulling dm-kcopyd to block layer so that we 
can readily use it. But we found it had many dependecies from dm-layer.
So later dropped that idea.
2. copy_file_range, for block device atleast we saw few check's which fail
it for raw block device. At this point I dont know much about the history of
why such check is present.
When I was researching pipe/splice code for supporting ublk zero copy[1], I
have got idea for async copy_file_range(), such as: io uring based
direct splice, user backed intermediate buffer, still zero copy, if these
ideas are finally implemented, we could get super-fast generic offload copy,
and bdev copy is really covered too.

[1] https://lore.kernel.org/linux-block/20221103085004.1029763-1-ming.lei@redhat.com/ (local)
Seems interesting, We will take a look into this.
Thanks,
Nitesh

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help