Re: Any bio_clone_slow() implementation which doesn't share bi_io_vec?
From: "hch@infradead.org" <hch@infradead.org>
Date: 2021-11-23 14:28:31
Also in:
dm-devel, linux-fsdevel
On Tue, Nov 23, 2021 at 11:39:11AM +0000, Johannes Thumshirn wrote:
> I think we have to differentiate two cases here: a "regular" REQ_OP_ZONE_APPEND bio and a RAID stripe REQ_OP_ZONE_APPEND bio. The first one (i.e. the regular REQ_OP_ZONE_APPEND bio) can't be split because we cannot guarantee the order in which the device writes the data to disk. For the RAID stripe bio we can split it into the two (or more) parts that will end up on _different_ devices. All we need to do is a) ensure it doesn't cross the device's zone append limit and b) clamp all bi_iter.bi_sector down to the start of the target zone, a.k.a. sticking to the rules of REQ_OP_ZONE_APPEND.
Exactly. A stacking driver must never split a REQ_OP_ZONE_APPEND bio. But the file system itself can of course split it, as long as each split-off bio has its own bi_end_io handler to record where it has been written to.
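
To make the rule concrete, here is a minimal userspace sketch of a file-system-side split. It is not kernel code: struct zone_model, struct append_part, split_zone_append(), device_zone_append() and record_location() are all hypothetical names invented for this illustration. The only kernel behaviour it mirrors is that a zone append is submitted with its sector set to the zone start, and that the sector the data actually landed on is reported back on completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical model of one sequential zone; the device owns the write pointer. */
struct zone_model {
	sector_t start;	/* first sector of the zone */
	sector_t wp;	/* next sector the device will write */
};

/* Hypothetical stand-in for one split-off zone append bio. */
struct append_part {
	sector_t sector;	/* submit: zone start; completion: actual location */
	uint32_t nr_sectors;
	void (*end_io)(struct append_part *part);	/* per-part completion handler */
	sector_t recorded;	/* location saved by end_io */
	int done;
};

/* Each part records its own location; the submitter cannot predict it. */
static void record_location(struct append_part *part)
{
	part->recorded = part->sector;
	part->done = 1;
}

/* Model of the device: it picks the location, not the submitter. */
static void device_zone_append(struct zone_model *z, struct append_part *part)
{
	assert(part->sector == z->start);	/* rule: always target the zone start */
	part->sector = z->wp;			/* report where the data landed */
	z->wp += part->nr_sectors;
	part->end_io(part);
}

/*
 * Split len sectors into parts of at most max_append sectors each.
 * Returns the number of parts, or 0 if max_parts is too small.
 */
static size_t split_zone_append(sector_t zone_start, uint32_t len,
				uint32_t max_append,
				struct append_part *parts, size_t max_parts)
{
	size_t n = 0;

	assert(max_append > 0);
	while (len > 0 && n < max_parts) {
		uint32_t chunk = len < max_append ? len : max_append;

		parts[n] = (struct append_part){
			.sector = zone_start,
			.nr_sectors = chunk,
			.end_io = record_location,
		};
		len -= chunk;
		n++;
	}
	return len == 0 ? n : 0;
}
```

Completing the parts in a different order than they were created shows why every part needs its own handler: the recorded locations follow the device's ordering, not the submission order, so a single shared record for the original bio would not be enough.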