Re: [PATCH 4/6] blk-mq: use EWMA to estimate congestion threshold

From: Ming Lei <hidden>
Date: 2017-07-13 10:43:42

On Wed, Jul 12, 2017 at 03:39:14PM +0000, Bart Van Assche wrote:

On Wed, 2017-07-12 at 10:30 +0800, Ming Lei wrote:

On Tue, Jul 11, 2017 at 12:25:16PM -0600, Jens Axboe wrote:

What happens with fluid congestion boundaries, with shared tags?

The approach in this patch should work, but the threshold may not
be accurate that way. One simple method is to use the average
tag weight in the EWMA, like this:

	sbitmap_weight() / hctx->tags->active_queues
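
To make the suggestion above concrete, here is a minimal sketch of feeding the per-queue average tag weight into an integer EWMA. It is an illustration only, not the code from the patch: EWMA_WEIGHT, avg_tag_weight() and ewma_update() are hypothetical names, and busy_tags simply stands in for the value sbitmap_weight() would return.

```c
#include <assert.h>

/* Hypothetical smoothing factor: each new sample gets 1/EWMA_WEIGHT of the vote. */
#define EWMA_WEIGHT 8u

/*
 * Per-queue sample: the busy tags divided evenly among the active queues.
 * busy_tags stands in for sbitmap_weight(); active_queues stands in for
 * hctx->tags->active_queues. Both are plain integers in this sketch.
 */
static unsigned int avg_tag_weight(unsigned int busy_tags,
				   unsigned int active_queues)
{
	if (active_queues == 0)
		return busy_tags;	/* nothing to share with */
	return busy_tags / active_queues;
}

/*
 * Integer EWMA update. A previous value of 0 is treated as "no estimate
 * yet", so the first sample is taken as-is (an assumption of this sketch).
 */
static unsigned int ewma_update(unsigned int prev, unsigned int sample)
{
	assert(EWMA_WEIGHT > 1);
	if (prev == 0)
		return sample;
	return (prev * (EWMA_WEIGHT - 1) + sample) / EWMA_WEIGHT;
}
```

With 64 busy tags and 64 active queues the sample is 1; with 4 active queues it is 16. A larger EWMA_WEIGHT makes the estimate react more slowly to a single unusual sample.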

Hello Ming,

That approach would result in severe performance degradation, because
"active_queues" represents the number of queues against which I/O has ever
been queued. If, for example, 64 LUNs are associated with a single SCSI host,
all 64 LUNs are responding, and the queue depth is also 64, then the approach
you proposed would reduce the effective queue depth per LUN from 64 to 1.

No, this approach does _not_ reduce the effective queue depth; it only
stops the queue for a while when the queue is busy enough.

In this case there may be no congestion at all, because blk-mq allows at most
queue_depth/active_queues tags to be assigned to each LUN; please see
hctx_may_queue(). get_driver_tag() can then return at most one pending tag to
each request_queue (LUN).
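
The arithmetic behind that limit can be sketched as follows. This is a simplified model of the queue_depth/active_queues share described above, not the kernel's hctx_may_queue(), which has further conditions that are not modelled here; fair_share_depth() and may_queue() are hypothetical names.

```c
#include <assert.h>

/*
 * Simplified fair-share limit per request queue on a shared tag set.
 * Assumption of this sketch: the share is queue_depth / active_queues,
 * rounded up so that every active queue can hold at least one tag.
 */
static unsigned int fair_share_depth(unsigned int queue_depth,
				     unsigned int active_queues)
{
	assert(queue_depth > 0);
	if (active_queues == 0)
		return queue_depth;	/* no sharing, whole depth available */
	return (queue_depth + active_queues - 1) / active_queues;
}

/* A queue may take another tag only while it is below its share. */
static int may_queue(unsigned int nr_active, unsigned int queue_depth,
		     unsigned int active_queues)
{
	return nr_active < fair_share_depth(queue_depth, active_queues);
}
```

For the example in this thread, fair_share_depth(64, 64) is 1, so a LUN that already holds one tag gets no second one until the first completes.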

The algorithm in this patch only starts to work when congestion happens;
that is, it only runs when BLK_STS_RESOURCE is returned from .queue_rq().
The point of this approach is to avoid dispatching requests to a busy queue
unnecessarily, so that we don't burn CPU for nothing, and merging improves
in the meantime.
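
As a rough illustration of "only runs when the driver reports busy", here is a toy model in plain C. It is not the patch: struct toy_queue, the TOY_STS_* values and the toy_* helpers are all hypothetical, and the threshold is stored as the raw in-flight count at the time of the busy return rather than as an EWMA.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the driver's return status; not the kernel's values. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

struct toy_queue {
	unsigned int in_flight;	/* dispatched, not yet completed */
	unsigned int threshold;	/* 0 means no congestion seen so far */
};

/* Until a busy return has been recorded, dispatch is never held back. */
static bool toy_should_dispatch(const struct toy_queue *q)
{
	return q->threshold == 0 || q->in_flight < q->threshold;
}

/* Account for the driver's answer to one dispatch attempt. */
static void toy_account(struct toy_queue *q, enum toy_status sts)
{
	if (sts == TOY_STS_RESOURCE)
		q->threshold = q->in_flight;	/* driver was full at this level */
	else
		q->in_flight++;
}

/* One request completed, which may reopen the gate. */
static void toy_complete(struct toy_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
}
```

In this model the queue is never throttled before the first busy return, and once one request completes the gate opens again, so the usable depth is not reduced.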

-- 
Ming
