Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
From: Linus Walleij <hidden>
Date: 2016-10-28 21:09:10
Also in:
lkml
On Fri, Oct 28, 2016 at 5:29 PM, Christoph Hellwig [off-list ref] wrot= e:
On Fri, Oct 28, 2016 at 11:32:21AM +0200, Linus Walleij wrote:quoted
So I'm not just complaining by the way, I'm trying to fix this. Also Bartlomiej from Samsung has done some stabs at switching MMC/SD to blk-mq. I just rebased my latest stab at a na=C4=ABve switch to blk-m=
q
quoted
to v4.9-rc2 with these results. The patch to enable MQ looks like this: https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git=
/commit/?h=3Dmmc-mq&id=3D8f79b527e2e854071d8da019451da68d4753f71d
quoted
I run these tests directly after boot with cold caches. The results are consistent: I ran the same commands 10 times in a row.A couple comments from a quick look over the patch: In the changelog you complain: ". Lack of front- and back-end merging in the MQ block layer creating several small requests instead of a few large ones." In blk-mq merging is controller by the BLK_MQ_F_SHOULD_MERGE and BLK_MQ_F_SG_MERGE flags. You set the former, but not the latter. BLK_MQ_F_SG_MERGE controls wether multiple physical contiguous pages get merged into a single segment. For a dd after a fresh boot that is probably very common. Except for the polarity of the merge flags the basic merge functionality between the legacy and blk-mq path should be the same, and if they aren't you've found a bug we need to address.
Aha OK I will make sure to set both flags next time. (I will also stop guessing about that as a cause since that part probably works.)
You also say that you disable the pipelining. How much of a performance gain did this feature give when added? How much does just removing that on it's own cost you?
Interestingly, the original commit doesn't say. http://marc.info/?l=3Dlinaro-dev&m=3D137645684811479&w=3D2 It however dependends the cache architecture of the machine how much is won. The heavier the cache flushes, the more it gains. I guess I need to make a patch removing that mechanism to bench it. It's pretty hard to get rid of because it goes really deep into the MMC subsystem. It's massaged in like a schampoo.
While I think that features is rather messy and should be avoided if possible I don't see how it's impossible to implement in blk-mq.
It's probably possible. What I discussed with Arnd was to let the blk-mq core call out to these pre-request and post-request hooks on new requests in parallel with processing a request or a queue of requests. I.e. add .prep_request() and .unprep_request() callbacks to struct blk_mq_ops. I tried to understand if the existing .init_request and .exit_request callbacks could be used. But as I understand it they are only used to allocate and prepare the extra per-request-associated memory and state, and does not have access to the request per se, so it doesn't know anything about the actual request when .init_request() is called. So we're looking for something called whenever the contents of a request are done, right before queueing it, and right after dequeueing it after being served.
If you just increase your queue depth and use the old scheme you should get it - if you currently can't handle the second command for some reason (i.e. the special request magic) you can just return BLK_MQ_RQ_QUEUE_BUSY from the queue_rq function.
Bartlomiejs patch set did that, but I haven't been able to reproduce it. I will try to make a clean patch in the spirit of his. Yours, Linus Walleij