Re: [PATCH V2] blk-mq: avoid to starve tag allocation after allocation process migrates
From: Omar Sandoval <osandov@osandov.com>
Date: 2018-05-23 22:47:51
On Thu, May 24, 2018 at 06:09:51AM +0800, Ming Lei wrote:
On Thu, May 24, 2018 at 1:48 AM, Omar Sandoval [off-list ref] wrote:quoted
On Wed, May 23, 2018 at 05:32:31PM +0800, Ming Lei wrote:quoted
On Tue, May 22, 2018 at 09:59:17PM -0600, Jens Axboe wrote:quoted
On 5/19/18 1:44 AM, Ming Lei wrote:quoted
When the allocation process is scheduled back and the mapped hw queue is changed, do one extra wake up on orignal queue for compensating wake up miss, so other allocations on the orignal queue won't be starved. This patch fixes one request allocation hang issue, which can be triggered easily in case of very low nr_request.Trying to think of better ways we can fix this, but I don't see any right now. Getting rid of the wake_up_nr() kills us on tons of tasks waiting.I am not sure if I understand your point, but this issue isn't related with wake_up_nr() actually, and it can be reproduced after reverting 4e5dff41be7b5201c1c47c (blk-mq: improve heavily contended tag case). All tasks in current sbq_wait_state may be scheduled to other CPUs, and there may still be tasks waiting for allocation from this sbitmap_queue, and the root cause is about cross-queue allocation, as you said, there are too many queues, :-)I don't follow. Your description of the problem was that we have two waiters and only wake up one, which doesn't in turn allocate and free a tag and wake up the second waiter. Changing it back to wake_up_nr() eliminates that problem. And if waking up everything doesn't fix it, how does your fix of waking up a few extra tasks fix it?What matters is that this patch wakes up the previous sbq, let's see if from another view: 1) still 2 hw queues, nr_requests are 2, and wake_batch is one 2) there are 3 waiters on hw queue 0 3) two in-flight requests in hw queue 0 are completed, and only two waiters of 3 are waken up because of wake_batch, but both the two waiters can be scheduled to another CPU and cause to switch to hw queue 1 4) then the 3rd waiter will wait for ever, since no in-flight request is in hw queue 0 any more. 5) this patch fixes it by the fake wakeup when waiter is scheduled to another hw queue The issue can be understood a bit easier if we just forget sbq_wait_state and focus on sbq, :-)
Okay, I see, although if I'm understanding correctly, it has everything to do with sbq_wait_state. If we only had one waitqueue, then wake_up() would always wake up all of the waiters, but because we have them spread out over multiple waitqueues, we have to call sbq_wake_up()/sbitmap_queue_wake_up() to do the wake up on the other waitqueue.