Re: [bug report] shared tags causes IO hang and performance drop
From: John Garry <hidden>
Date: 2021-04-27 10:19:49
Also in:
linux-scsi
On 27/04/2021 10:52, Ming Lei wrote:
>> BTW, for the performance issue which Yanhui witnessed with megaraid sas, do you think it may be because of the IO sched tags issue of total sched tag depth growing vs driver tags?
>
> I think it is highly possible. Will you work on a patch to convert to per-request-queue sched tag?

Sure, I'm just hacking now to see what difference it can make to performance. Early results look promising...
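
To make the "total sched tag depth growing vs driver tags" point concrete, here is a rough toy model, not kernel code. It assumes that sched tags are allocated per hardware queue with a default depth of 2 * min(driver depth, 128), while driver tags are shared hostwide. The host tag depth of 228 is from this thread; the hardware queue count of 16 and the 128 cap are assumptions for illustration only.

```python
# Toy arithmetic only; none of these names are kernel symbols.
HOST_TAG_DEPTH = 228       # megaraid host tag depth quoted in this thread
ASSUMED_MAX_RQ = 128       # assumption: cap applied before doubling
ASSUMED_NR_HW_QUEUES = 16  # hypothetical: not stated anywhere in the thread


def default_sched_depth(driver_depth, max_rq=ASSUMED_MAX_RQ):
    """Assumed default sched tag depth for one set of sched tags."""
    return 2 * min(driver_depth, max_rq)


def total_sched_tags(nr_hw_queues, sched_depth, per_request_queue):
    """Total sched tags one request queue can hand out.

    per_request_queue=False models one set of sched tags per hardware
    queue; True models a single set shared across the request queue.
    """
    if per_request_queue:
        return sched_depth
    return nr_hw_queues * sched_depth


if __name__ == "__main__":
    depth = default_sched_depth(HOST_TAG_DEPTH)
    per_hctx = total_sched_tags(ASSUMED_NR_HW_QUEUES, depth, False)
    per_queue = total_sched_tags(ASSUMED_NR_HW_QUEUES, depth, True)
    print("driver tags (hostwide):", HOST_TAG_DEPTH)
    print("sched tags, per hw queue sets:", per_hctx)
    print("sched tags, per request queue:", per_queue)
```

Under these assumptions the per-hardware-queue total scales with the queue count while the driver tags stay fixed, which is the mismatch that per-request-queue sched tags would remove.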
>> Are there lots of LUNs? I can imagine that megaraid sas has much larger can_queue than scsi_debug :)
>
> No, there are just two LUNs. The 1st LUN is one commodity SSD (queue depth is 32), and the performance issue is reported on this LUN. The other is one HDD (queue depth is 256), which is the root disk. But the megaraid host tag depth is 228, another weird setting. The issue can still be reproduced after we set the 2nd LUN's depth to 64 to avoid driver tag contention.

BTW, one more thing which Kashyap and I looked at when initially developing the hostwide tag support was the wait struct usage in the tag exhaustion scenario:
https://lore.kernel.org/linux-block/ecaeccf029c6fe377ebd4f30f04df9f1@mail.gmail.com/

IIRC, we looked at a "hostwide" wait_index. It didn't seem to make a difference then, and we didn't end up making any changes here, but it is still worth remembering.

Thanks,
John