Thread (14 messages) 14 messages, 3 authors, 2018-08-03

Re: [RFC] blk-mq: clean up the hctx restart

From: jianchao.wang <hidden>
Date: 2018-08-01 02:17:38
Also in: lkml

Hi Ming

Thanks for your kindly response.

On 07/31/2018 02:16 PM, Ming Lei wrote:
On Tue, Jul 31, 2018 at 01:19:42PM +0800, jianchao.wang wrote:
quoted
Hi Ming

On 07/31/2018 12:58 PM, Ming Lei wrote:
quoted
On Tue, Jul 31, 2018 at 12:02:15PM +0800, Jianchao Wang wrote:
quoted
Currently, we will always set SCHED_RESTART whenever there are
requests in hctx->dispatch, then when request is completed and
freed the hctx queues will be restarted to avoid IO hang. This
is unnecessary most of time. Especially when there are lots of
LUNs attached to one host, the RR restart loop could be very
expensive.
The big RR restart loop has been killed in the following commit:

commit 97889f9ac24f8d2fc8e703ea7f80c162bab10d4d
Author: Ming Lei [off-list ref]
Date:   Mon Jun 25 19:31:48 2018 +0800

    blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()
Oh, sorry, I didn't look into this patch due to its title when iterated the mail list,
therefore I didn't realize the RR restart loop has already been killed. :)

The RR restart loop could ensure the fairness of sharing some LLDD resource,
not just avoid IO hung. Is it OK to kill it totally ?
Yeah, it is, also the fairness might be improved a bit by the way in
commit 97889f9ac24f8d2fc, especially inside driver tag allocation
algorithem.
Would you mind to detail more here ?

Regarding the driver tag case:
For example:

q_a         q_b        q_c       q_d
hctx0       hctx0      hctx0     hctx0

                 tags

Total number of tags is 32
All of these 4 q are active.

So every q has 8 tags.

If all of these 4 q have used up their 8 tags, they have to wait.

When part of the in-flight requests q_a are completed, tags are freed.
but the __sbq_wake_up doesn't wake up the q_a, it may wake up q_b.
However, due to the limits in hctx_may_queue, q_b still cannot get the
tags. The RR restart also will not wake up q_a.
This is unfair for q_a.

When we remove RR restart fashion, at least, the q_a will be waked up by
the hctx restart.
Is this the improvement of fairness you said in driver tag allocation ?

Think further, it seems that it only works for case with io scheduler.
w/o io scheduler, tasks will wait in blk_mq_get_request. restart hctx will
not work there.

Thanks
Jianchao
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help