Thread: 60 messages, 6 authors, 2018-01-29

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

From: Ming Lei <hidden>
Date: 2018-01-19 23:57:29
Also in: dm-devel, lkml

On Fri, Jan 19, 2018 at 09:23:35AM -0700, Jens Axboe wrote:
On 1/19/18 9:13 AM, Mike Snitzer wrote:
On Fri, Jan 19 2018 at 10:48am -0500,
Jens Axboe [off-list ref] wrote:
On 1/19/18 8:40 AM, Ming Lei wrote:
Where does the dm STS_RESOURCE error usually come from - what exact
resource are we running out of?
It is from blk_get_request(underlying queue), see
multipath_clone_and_map().
That's what I thought. So for a low queue depth underlying queue, it's
quite possible that this situation can happen. Two potential solutions
I see:

1) As described earlier in this thread, having a mechanism for being
   notified when the scarce resource becomes available. It would not
   be hard to tap into the existing sbitmap wait queue for that.

2) Have dm set BLK_MQ_F_BLOCKING and just sleep on the resource
   allocation. I haven't read the dm code to know if this is a
   possibility or not.
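
As a rough illustration of option 1 only, here is a minimal userspace sketch in Python, not kernel code. `TagPool`, `try_get()` and `put()` are hypothetical names invented for this example; they only model the idea of registering a callback that fires when a scarce tag is freed, and they do not reflect the real sbitmap wait queue API.

```python
from collections import deque
from typing import Callable, Deque


class TagPool:
    """Hypothetical model of a low-depth queue's tag space (not sbitmap)."""

    def __init__(self, depth: int) -> None:
        self.free = depth
        # Callbacks registered by callers that failed to get a tag.
        self.waiters: Deque[Callable[[], None]] = deque()

    def try_get(self, on_available: Callable[[], None]) -> bool:
        """Non-blocking allocation; on failure, remember whom to notify."""
        if self.free > 0:
            self.free -= 1
            return True
        self.waiters.append(on_available)
        return False

    def put(self) -> None:
        """Release a tag and notify one waiter so it can rerun its queue."""
        self.free += 1
        if self.waiters:
            self.waiters.popleft()()
```

In this model the failed caller never sleeps: it returns a busy status and is asked to retry once `put()` runs, which is the property option 1 is after.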
Right, #2 is _not_ the way forward.  Historically request-based DM used
its own mempool for requests, this was to be able to have some measure
of control and resiliency in the face of low memory conditions that
might be affecting the broader system.

Then Christoph switched over to adding per-request-data; which ushered
in the use of blk_get_request using ATOMIC allocations.  I like the
result of that line of development.  But taking the next step of setting
BLK_MQ_F_BLOCKING is highly unfortunate (especially in that this
dm-mpath.c code is common to old .request_fn and blk-mq, at least the
call to blk_get_request is).  Ultimately dm-mpath would like to avoid blocking
for a request because for this dm-mpath device we have multiple queues
to allocate from if need be (provided we have an active-active storage
network topology).
If you can go to multiple devices, obviously it should not block on a
single device. In the case where you can only go to one device,
blocking at that point would probably be fine. Or if all your paths
are busy, then blocking would also be OK.
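
The policy described above can be sketched in userspace Python as follows. This is an illustrative model only, not dm-mpath code: `Path`, `try_get_request()` and `clone_and_map()` are hypothetical stand-ins, and the real `multipath_clone_and_map()` and `blk_get_request()` behave differently in detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    """Hypothetical stand-in for one underlying queue with a fixed depth."""
    name: str
    depth: int
    in_flight: int = 0

    def try_get_request(self) -> bool:
        """Non-blocking allocation: succeeds only if a slot is free."""
        if self.in_flight < self.depth:
            self.in_flight += 1
            return True
        return False

    def put_request(self) -> None:
        """Release one slot previously taken by try_get_request()."""
        assert self.in_flight > 0
        self.in_flight -= 1


def clone_and_map(paths: List[Path]) -> Optional[Path]:
    """Try every path without blocking; return None if all are busy."""
    for path in paths:
        if path.try_get_request():
            return path
    # All paths busy: the caller decides whether to requeue or to block.
    return None
```

With two paths of depth 1, the first two calls succeed on different paths and the third returns `None`; only at that point would blocking or a requeue come into question.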
Introducing one extra blocking point will hurt AIO performance, since
there are usually far fewer jobs/processes submitting IO in that case.

-- 
Ming