On 11/25/21 2:07 PM, Jens Axboe wrote:
> On 11/25/21 2:05 PM, Kenneth R. Crudup wrote:
>> On Tue, 23 Nov 2021, Jens Axboe wrote:
>>
>>> It looks like some missed accounting. You can just disable wbt for now, would
>>> be a useful data point to see if that fixes it. Just do:
>>>
>>> echo 0 > /sys/block/nvme0n1/queue/wbt_lat_usec
>>>
>>> and that will disable writeback throttling on that device.
>>
>> It's been about 48 hours and haven't seen the issue since doing this.
>
> Great, thanks for verifying. From your report 5.16-rc2 has the issue, is
> 5.15 fine?

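For anyone repeating the wbt test above, here is a minimal sketch of toggling the setting and putting it back afterwards. The function names and the WBT_FILE variable are made up for illustration and are not part of any tool. The device path is the one quoted in the thread, so adjust it for other devices. Writing -1 is expected to restore the kernel's default latency target, and writes to the real sysfs file need root.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch) around wbt_lat_usec.
# WBT_FILE defaults to the device path quoted in the thread.
WBT_FILE="${WBT_FILE:-/sys/block/nvme0n1/queue/wbt_lat_usec}"

# Print the current latency target in microseconds (0 means wbt is disabled).
wbt_show() {
    cat "$WBT_FILE"
}

# Disable writeback throttling, as suggested in the thread.
wbt_disable() {
    echo 0 > "$WBT_FILE"
}

# Ask the kernel to go back to its default latency target.
wbt_reset() {
    echo -1 > "$WBT_FILE"
}
```

A typical run would be wbt_show to record the starting value, wbt_disable for the duration of the test, then wbt_reset once a fixed kernel is booted.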
Can you apply this on top of 5.16-rc2 or current -git and see if it fixes
it for you?
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8799fa73ef34..8874a63ae952 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -860,13 +860,14 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
 		if (iob->need_ts)
 			__blk_mq_end_request_acct(rq, now);
 
+		rq_qos_done(rq->q, rq);
+
 		WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 		if (!refcount_dec_and_test(&rq->ref))
 			continue;
 
 		blk_crypto_free_request(rq);
 		blk_pm_mark_last_busy(rq);
-		rq_qos_done(rq->q, rq);
 
 		if (nr_tags == TAG_COMP_BATCH || cur_hctx != rq->mq_hctx) {
 			if (cur_hctx)
--
Jens Axboe