Re: [PATCH BUGFIX V2] block, bfq: update wr_busy_queues if needed on a queue split
From: Paolo Valente <hidden>
Date: 2017-06-28 13:44:26
Also in:
lkml
Il giorno 28 giu 2017, alle ore 14:42, Jens Axboe [off-list ref] ha =
scritto:
=20 On 06/27/2017 11:39 PM, Paolo Valente wrote:quoted
=20quoted
Il giorno 27 giu 2017, alle ore 20:29, Jens Axboe [off-list ref] =
ha scritto:
quoted
quoted
=20 On 06/27/2017 12:27 PM, Paolo Valente wrote:quoted
=20quoted
Il giorno 27 giu 2017, alle ore 16:41, Jens Axboe =
[off-list ref] ha scritto:
quoted
quoted
quoted
quoted
=20 On 06/27/2017 12:09 AM, Paolo Valente wrote:quoted
=20quoted
Il giorno 19 giu 2017, alle ore 13:43, Paolo Valente =
[off-list ref] ha scritto:
quoted
quoted
quoted
quoted
quoted
quoted
=20 This commit fixes a bug triggered by a non-trivial sequence of events. These events are briefly described in the next two paragraphs. The impatiens, or those who are familiar with queue merging and splitting, can jump directly to the last paragraph. =20 On each I/O-request arrival for a shared bfq_queue, i.e., for a bfq_queue that is the result of the merge of two or more =
bfq_queues,
quoted
quoted
quoted
quoted
quoted
quoted
BFQ checks whether the shared bfq_queue has become seeky (i.e., =
if too
quoted
quoted
quoted
quoted
quoted
quoted
many random I/O requests have arrived for the bfq_queue; if the =
device
quoted
quoted
quoted
quoted
quoted
quoted
is non rotational, then random requests must be also small for =
the
quoted
quoted
quoted
quoted
quoted
quoted
bfq_queue to be tagged as seeky). If the shared bfq_queue is =
actually
quoted
quoted
quoted
quoted
quoted
quoted
detected as seeky, then a split occurs: the bfq I/O context of =
the
quoted
quoted
quoted
quoted
quoted
quoted
process that has issued the request is redirected from the =
shared
quoted
quoted
quoted
quoted
quoted
quoted
bfq_queue to a new non-shared bfq_queue. As a degenerate case, =
if the
quoted
quoted
quoted
quoted
quoted
quoted
shared bfq_queue actually happens to be shared only by one =
process
quoted
quoted
quoted
quoted
quoted
quoted
(because of previous splits), then no new bfq_queue is created: =
the
quoted
quoted
quoted
quoted
quoted
quoted
state of the shared bfq_queue is just changed from shared to non shared. =20 Regardless of whether a brand new non-shared bfq_queue is =
created, or
quoted
quoted
quoted
quoted
quoted
quoted
the pre-existing shared bfq_queue is just turned into a =
non-shared
quoted
quoted
quoted
quoted
quoted
quoted
bfq_queue, several parameters of the non-shared bfq_queue are =
set
quoted
quoted
quoted
quoted
quoted
quoted
(restored) to the original values they had when the bfq_queue associated with the bfq I/O context of the process (that has =
just
quoted
quoted
quoted
quoted
quoted
quoted
issued an I/O request) was merged with the shared bfq_queue. One =
of
quoted
quoted
quoted
quoted
quoted
quoted
these parameters is the weight-raising state. =20 If, on the split of a shared bfq_queue, 1) a pre-existing shared bfq_queue is turned into a non-shared bfq_queue; 2) the previously shared bfq_queue happens to be busy; 3) the weight-raising state of the previously shared bfq_queue =
happens
quoted
quoted
quoted
quoted
quoted
quoted
to change; the number of weight-raised busy queues changes. The field wr_busy_queues must then be updated accordingly, but such an =
update
quoted
quoted
quoted
quoted
quoted
quoted
was missing. This commit adds the missing update. =20=20 Hi Jens, any idea of the possible fate of this fix?=20 I sort of missed this one. It looks trivial enough for 4.12, or we can defer until 4.13. What do you think? =20=20 It should actually be something trivial, and hopefully correct, because a further throughput improvement (for BFQ), which depends =
on
quoted
quoted
quoted
this fix, is now working properly, and we didn't see any regression =
so
quoted
quoted
quoted
far. In addition, since this improvement is virtually ready for submission, further steps may be probably easier if this fix gets =
in
quoted
quoted
quoted
sooner (whatever the luck of the improvement will be).=20 OK, let's queue it up for 4.13 then. =20=20 My arguments was in favor of 4.12 actually. Maybe you did mean 4.12 here?=20 You were talking about further improvements and new development on top of this, so I assumed you meant 4.13. However, further development is not the main criteria or concern for whether this fix should go into 4.12 or not.
Ok, thanks for your explanation and patience.
The main concern is if this fixes something that is crucial to have in 4.12. It's late in the cycle, I'd rather not push anything that isn't a regression fix at this point. =20
Hard to assess precisely how crucial this is. Certainly it fixes a regression. The practical, negative effects of this regression are systematic when one tries to add the throughput improvement I mentioned: the improvement almost never works. If BFQ is used as it is, then negative effects on throughput are less likely to happen. I hope that this piece of information is somehow useful for your decision. Thanks, Paolo
--=20 Jens Axboe