Thread: 38 messages, 5 authors, 2021-07-07

Re: [bug report] shared tags causes IO hang and performance drop

From: John Garry <hidden>
Date: 2021-04-26 17:05:52
Also in: linux-scsi

On 26/04/2021 17:03, Ming Lei wrote:
>> For both hostwide and non-hostwide tags, we have standalone sched tags and
>> request pool per hctx when q->nr_hw_queues > 1.
> driver tags is shared for hostwide tags.
>
>>> That is why you observe that scheduler tag exhaustion
>>> is easy to trigger in case of non-hostwide tags.
>>>
>>> I'd suggest to add one per-request-queue sched tags, and make all hctxs
>>> sharing it, just like what you did for driver tag.
>> That sounds reasonable.
>>
>> But I don't see how this is related to hostwide tags specifically, but
>> rather just having q->nr_hw_queues > 1, which NVMe PCI and some other SCSI
>> MQ HBAs have (without using hostwide tags).
> Before hostwide tags, the whole scheduler queue depth should be 256.
> After hostwide tags, the whole scheduler queue depth becomes 256 *
> nr_hw_queues. But the driver tag queue depth is _not_ changed.

Fine.

> More requests come and are tried to dispatch to LLD and can't succeed
> because of limited driver tag depth, and CPU utilization could be increased.

Right, maybe this is a problem.
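
To make the depth arithmetic above concrete, here is a minimal sketch in plain C. It is not kernel code: the struct and function names are hypothetical, the 16 hardware queues in the usage note are an assumed figure (the thread does not state one), and the driver tag depth of 128 is taken from the annotation on q->nr_requests later in this message.

```c
/*
 * Hypothetical model of the depths discussed above, not kernel code.
 * All names here are illustrative assumptions.
 */
struct depth_model {
	unsigned int nr_hw_queues;         /* hctxs exposed by the driver */
	unsigned int sched_depth_per_hctx; /* 256 in the thread */
	unsigned int driver_depth;         /* shared (hostwide) driver tag depth */
};

/* Total scheduler queue depth: one sched tag set per hctx. */
static unsigned int total_sched_depth(const struct depth_model *m)
{
	return m->sched_depth_per_hctx * m->nr_hw_queues;
}

/*
 * Upper bound on requests that can hold a sched tag at once while
 * having no driver tag available to them.
 */
static unsigned int max_waiting_for_driver_tag(const struct depth_model *m)
{
	unsigned int total = total_sched_depth(m);

	return total > m->driver_depth ? total - m->driver_depth : 0;
}
```

With one hardware queue the total scheduler depth is 256; with the assumed 16 queues it becomes 4096, while the driver tag depth stays at 128, so up to 3968 requests could be queued without any chance of getting a driver tag.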

I quickly added some debug code, and see that
__blk_mq_get_driver_tag()->__sbitmap_queue_get() fails ~7% of the time for
hostwide tags and 3% of the time for non-hostwide tags.

Having it fail at all for non-hostwide tags seems a bit dubious... 
here's the code for deciding the rq sched tag depth:

q->nr_requests = 2 * min(q->tag_set->queue_depth [128], BLKDEV_MAX_RQ [128])

So we get 256 for our test scenario, which is appreciably bigger than
q->tag_set->queue_depth (128), so the failures make sense.
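
As a worked example of the formula above, the following standalone C sketch computes the default sched tag depth for a given driver queue depth. It is an illustration only: MAX_RQ stands in for the kernel constant with the value 128 annotated above, and the helper names are hypothetical, not kernel functions.

```c
/* Stand-in for the kernel constant annotated as [128] above. */
#define MAX_RQ 128u

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Mirrors the quoted formula: 2 * min(queue_depth, 128). */
static unsigned int default_sched_depth(unsigned int queue_depth)
{
	return 2 * min_uint(queue_depth, MAX_RQ);
}

/* Sched tags in excess of driver tags for one hctx (0 if none). */
static unsigned int sched_over_driver(unsigned int queue_depth)
{
	unsigned int sched = default_sched_depth(queue_depth);

	return sched > queue_depth ? sched - queue_depth : 0;
}
```

For a queue depth of 128 this gives a sched depth of 256, which is 128 more than the driver tags available, consistent with some driver tag allocations failing even without hostwide tags. For a driver queue depth of 256 or more the excess drops to 0.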

Anyway, I'll look at adding code for per-request-queue sched tags to
see if it helps. But I plan to continue to use a per-hctx sched
request pool.

Thanks,
John