Re: [RFC PATCH v3 1/3] blk-mq: Clean up references to old requests when freeing rqs
From: John Garry <hidden>
Date: 2021-03-08 11:21:55
Also in: lkml
On 06/03/2021 02:52, Khazhy Kumykov wrote:
> On Fri, Mar 5, 2021 at 7:20 AM John Garry [off-list ref] wrote:
>> It has been reported many times that a use-after-free can be
>> intermittently found when iterating busy requests:
>>
>> - https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96fa96@huawei.com/
>> - https://lore.kernel.org/linux-block/5c3ac5af-ed81-11e4-fee3-f92175f14daf@acm.org/T/#m6c1ac11540522716f645d004e2a5a13c9f218908
>> - https://lore.kernel.org/linux-block/04e2f9e8-79fa-f1cb-ab23-4a15bf3f64cc@kernel.dk/
>>
>> The issue is that when we switch scheduler or change queue depth, there
>> may be references in the driver tagset to the stale requests.
>>
>> As a solution, clean up any references to those requests in the driver
>> tagset. This is done with a cmpxchg to make safe any race with setting
>> the driver tagset request from another queue.
>
> I noticed this crash recently when running blktests on a "debug" config
> on a 4.15 based kernel (it would always crash), and backporting this
> change fixes it. (testing on linus's latest tree also confirmed the fix,
> with the same config). I realize I'm late to the conversation, but
> appreciate the investigation and fixes :)
Good to know. I'll explicitly cc you on further versions.

Thanks,
John
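
For readers following the thread, the cmpxchg-based cleanup described in the quoted patch description can be illustrated with a small userspace sketch. This is not the patch itself: it uses C11 atomics instead of the kernel's cmpxchg(), and `struct tagset`, `NR_TAGS`, `in_pool()`, `clear_stale_refs()` and `stale_ref_demo()` are hypothetical names made up for the illustration. The idea shown is only that a slot is cleared if it still holds the stale pointer that was observed, so a concurrent store of a live request from another queue is not overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_TAGS 4 /* hypothetical tagset depth, for illustration only */

struct request { int id; };

/* Hypothetical stand-in for a driver tagset's tag -> request table. */
struct tagset {
    _Atomic(struct request *) rqs[NR_TAGS];
};

/* True if rq points into the array pool[0..nr). */
static int in_pool(const struct request *rq, const struct request *pool,
                   size_t nr)
{
    uintptr_t p = (uintptr_t)rq;

    return p >= (uintptr_t)pool && p < (uintptr_t)(pool + nr);
}

/*
 * Clear every slot that still references a request from the pool that is
 * about to be freed. The compare-and-exchange only writes NULL if the slot
 * still holds the pointer we just read; if another queue stored its own
 * request in the meantime, the exchange fails and that slot is left alone.
 * Returns the number of slots cleared.
 */
static int clear_stale_refs(struct tagset *ts, const struct request *pool,
                            size_t nr)
{
    int cleared = 0;

    for (size_t tag = 0; tag < NR_TAGS; tag++) {
        struct request *seen = atomic_load(&ts->rqs[tag]);

        if (seen == NULL || !in_pool(seen, pool, nr))
            continue;
        if (atomic_compare_exchange_strong(&ts->rqs[tag], &seen,
                                           (struct request *)NULL))
            cleared++;
    }
    return cleared;
}

/* Single-threaded self-check: two stale slots, one live slot, one empty. */
static int stale_ref_demo(void)
{
    struct request old_pool[2] = { { 1 }, { 2 } };  /* about to be freed */
    struct request other_pool[1] = { { 3 } };       /* owned by another queue */
    struct tagset ts;
    int cleared;

    for (size_t i = 0; i < NR_TAGS; i++)
        atomic_init(&ts.rqs[i], (struct request *)NULL);
    atomic_store(&ts.rqs[0], &old_pool[0]);
    atomic_store(&ts.rqs[1], &other_pool[0]);
    atomic_store(&ts.rqs[2], &old_pool[1]);

    cleared = clear_stale_refs(&ts, old_pool, 2);

    assert(atomic_load(&ts.rqs[0]) == NULL);
    assert(atomic_load(&ts.rqs[1]) == &other_pool[0]);
    assert(atomic_load(&ts.rqs[2]) == NULL);
    return cleared;
}
```

Calling `stale_ref_demo()` from a `main()` exercises the sketch. A plain store of NULL would be simpler, but it could wipe out a pointer that another queue has just installed in the same slot, which is the race the compare-and-exchange is meant to avoid.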