Re: [PATCH V11 14/19] block: handle non-cluster bio out of blk_bio_segment_split
From: Ming Lei <hidden>
Date: 2018-11-21 15:37:27
Also in:
dm-devel, linux-bcache, linux-block, linux-btrfs, linux-fsdevel, linux-mm, linux-raid, linux-xfs, lkml
On Wed, Nov 21, 2018 at 03:33:55PM +0100, Christoph Hellwig wrote:
quoted
+ non-cluster.oDo we really need a new source file for these few functions?quoted
default: + if (!blk_queue_cluster(q)) { + blk_queue_non_cluster_bio(q, bio); + return;I'd name this blk_bio_segment_split_singlepage or similar.
OK.
quoted
+static __init int init_non_cluster_bioset(void) +{ + WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, + BIOSET_NEED_BVECS)); + WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); + WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0));Please only allocate the resources once a queue without the cluster flag is registered, there are only very few modern drivers that do that.
OK.
quoted
+static void non_cluster_end_io(struct bio *bio) +{ + struct bio *bio_orig = bio->bi_private; + + bio_orig->bi_status = bio->bi_status; + bio_endio(bio_orig); + bio_put(bio); +}Why can't we use bio_chain for the split bios?
The parent bio is multi-page bvec, we can't submit it for non-cluster.
quoted
+ bio_for_each_segment(from, *bio_orig, iter) { + if (i++ < max_segs) + sectors += from.bv_len >> 9; + else + break; + }The easy to read way would be: bio_for_each_segment(from, *bio_orig, iter) { if (i++ == max_segs) break; sectors += from.bv_len >> 9; }
OK.
quoted
+ if (sectors < bio_sectors(*bio_orig)) { + bio = bio_split(*bio_orig, sectors, GFP_NOIO, + &non_cluster_bio_split); + bio_chain(bio, *bio_orig); + generic_make_request(*bio_orig); + *bio_orig = bio;I don't think this is very efficient, as this means we now clone the bio twice, first to split it at the sector boundary, and then again when converting it to single-page bio_vec.
That is exactly what bounce code does. The problem for both bounce and non-cluster is same actually because the bvec table itself has to be changed.
quoted hunk ↗ jump to hunk
I think this could be something like this (totally untested):diff --git a/block/non-cluster.c b/block/non-cluster.c index 9c2910be9404..60389f275c43 100644 --- a/block/non-cluster.c +++ b/block/non-cluster.c@@ -13,58 +13,59 @@ #include "blk.h" -static struct bio_set non_cluster_bio_set, non_cluster_bio_split; +static struct bio_set non_cluster_bio_set; static __init int init_non_cluster_bioset(void) { WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS)); WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); - WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0)); return 0; } __initcall(init_non_cluster_bioset); -static void non_cluster_end_io(struct bio *bio) -{ - struct bio *bio_orig = bio->bi_private; - - bio_orig->bi_status = bio->bi_status; - bio_endio(bio_orig); - bio_put(bio); -} - void blk_queue_non_cluster_bio(struct request_queue *q, struct bio **bio_orig) { - struct bio *bio; struct bvec_iter iter; - struct bio_vec from; - unsigned i = 0; - unsigned sectors = 0; - unsigned short max_segs = min_t(unsigned short, BIO_MAX_PAGES, - queue_max_segments(q)); + struct bio *bio; + struct bio_vec bv; + unsigned short max_segs, segs = 0; + + bio = bio_alloc_bioset(GFP_NOIO, bio_segments(*bio_orig), + &non_cluster_bio_set);
bio_segments(*bio_orig) may be > 256, so bio_alloc_bioset() may fail. Thanks, Ming