Re: [PATCH] block, bfq: dispatch request to prevent queue stalling after the... | linux-block

Re: [PATCH] block, bfq: dispatch request to prevent queue stalling after the request completion

From: Hou Tao <hidden>
Date: 2017-07-12 10:37:15


On 2017/7/12 17:41, Paolo Valente wrote:

quoted

Il giorno 11 lug 2017, alle ore 15:58, Hou Tao [off-list ref] ha scritto:

There are mq devices (eg., virtio-blk, nbd and loopback) which don't
invoke blk_mq_run_hw_queues() after the completion of a request.
If bfq is enabled on these devices and the slice_idle attribute or
strict_guarantees attribute is set as zero,

I guess you meant slice_idle > 0 or strict_guarantees equal to 1 here.

Sorry for the typo, I mean slice_idle = 0 or strict_guarantees = 1.
Setting slice_idle as 0 alone could also trigger the latency problem.

quoted

it is possible that
after a request completion the remaining requests of busy bfq queue
will stalled in the bfq schedule until a new request arrives.

To fix the scheduler latency problem, we need to check whether or not
all issued requests have completed and dispatch more requests to driver
if there is no request in driver.

Signed-off-by: Hou Tao <redacted>

Thanks for this fix.  My only (possible) concern is whether there
would be some more coherent and possibly more efficient solution to
this problem, outside BFQ.  For example, would it make sense to call
blk_mq_run_hw_queues(), after a request completion, from the offended
devices too?  This would make the behavior of these devices coherent
with that of the other devices.  Unfortunately I have no sound answer
to such a question.  Apart from this concern:

Reviewed-by: Paolo Valente <redacted>

Thanks for your review.

The inconsistencies between the different mq drivers also make me confused.
I don't known the reason why scsi-mq dispatches the remaining requests
after each request completion and virtio-blk does not.

I'm not sure whether or not fixing in MQ is a better idea, but I has a proposal.
If the BLK_MQ_S_SCHED_RESTART flag has been set, MQ would invoke blk_mq_run_hw_queues()
after the request completion. The BLK_MQ_S_SCHED_RESTART flag will be set by
blk_mq_sched_dispatch_requests() when it finds out that there are residual requests in its
dispatch list. We can let .dispatch_request() return a flag to set the BLK_MQ_S_SCHED_RESTART
flag if the underlying scheduler needs it, so after the request completion, the MQ core will
pull the remaining requests from BFQ.

Regards,
Tao

quoted

The problem can be reproduced by running the following script
on a virtio-blk device with nr_hw_queues as 1:

#!/bin/sh

dev=vdb
# mount point for dev
mp=/tmp/mnt
cd $mp

job=strict.job
cat <<EOF > $job
[global]
direct=1
bs=4k
size=256M
rw=write
ioengine=libaio
iodepth=128
runtime=5
time_based

[1]
filename=1.data

[2]
new_group
filename=2.data
EOF

echo bfq > /sys/block/$dev/queue/scheduler
echo 1 > /sys/block/$dev/queue/iosched/strict_guarantees
fio $job
---
block/bfq-iosched.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 12bbc6b..94cd682 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c

@@ -4293,6 +4293,9 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)

			bfq_bfqq_expire(bfqd, bfqq, false,
					BFQQE_NO_MORE_REQUESTS);
	}
+
+	if (!bfqd->rq_in_driver)
+		bfq_schedule_dispatch(bfqd);
}

static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
-- 
2.5.0

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help