Re: v4.20-rc6: Sporadic use-after-free in bt_iter()
From: Jens Axboe <axboe@kernel.dk>
Date: 2018-12-20 04:19:46
On 12/19/18 8:24 PM, jianchao.wang wrote:
On 12/20/18 11:17 AM, Jens Axboe wrote:quoted
On 12/19/18 5:16 PM, Bart Van Assche wrote:quoted
On Wed, 2018-12-19 at 16:27 -0700, Jens Axboe wrote:quoted
On 12/19/18 4:24 PM, Bart Van Assche wrote:quoted
Hello, If I run the srp blktests in a loop then I see the below call stack appearing sporadically. I have not yet had the time to analyze this but I'm reporting this here in case someone else would already have had a look at this. Bart. ================================================================== BUG: KASAN: use-after-free in bt_iter+0x86/0xf0 Read of size 8 at addr ffff88803b335240 by task fio/21412 CPU: 0 PID: 21412 Comm: fio Tainted: G W 4.20.0-rc6-dbg+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0x86/0xca print_address_description+0x71/0x239 kasan_report.cold.5+0x242/0x301 __asan_load8+0x54/0x90 bt_iter+0x86/0xf0 blk_mq_queue_tag_busy_iter+0x373/0x5e0 blk_mq_in_flight+0x96/0xb0 part_in_flight+0x40/0x140 part_round_stats+0x18e/0x370 blk_account_io_start+0x3d7/0x670 blk_mq_bio_to_request+0x19c/0x3a0 blk_mq_make_request+0x7a9/0xcb0 generic_make_request+0x41d/0x960 submit_bio+0x9b/0x250 do_blockdev_direct_IO+0x435c/0x4c70 __blockdev_direct_IO+0x79/0x88 ext4_direct_IO+0x46c/0xc00 generic_file_direct_write+0x119/0x210 __generic_file_write_iter+0x11c/0x280 ext4_file_write_iter+0x1b8/0x6f0 aio_write+0x204/0x310 io_submit_one+0x9d3/0xe80 __x64_sys_io_submit+0x115/0x340 do_syscall_64+0x71/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f02cf043219I've seen this one before as well, it's not a new thing. As far as I can tell, it's a false positive. There should be no possibility for a use-after-free iterating the static tags/requests.Are you sure this is a false positive?No I'm not, but the few times I have seen it, I haven't been able to make much sense of it. It goes back quite a bit. I have not yet encountered any falsequoted
positive KASAN complaints. According to the following gdb output this complaint refers to reading rq->q: (gdb) list *(bt_iter+0x86) 0xffffffff816b9346 is in bt_iter (block/blk-mq-tag.c:237). 232 233 /* 234 * We can hit rq == NULL here, because the tagging functions 235 * test and set the bit before assigning ->rqs[]. 236 */ 237 if (rq && rq->q == hctx->queue) 238 iter_data->fn(hctx, rq, iter_data->data, reserved); 239 return true; 240 } 241 From the disassembly output: 232 233 /* 234 * We can hit rq == NULL here, because the tagging functions 235 * test and set the bit before assigning ->rqs[]. 236 */ 237 if (rq && rq->q == hctx->queue) 0xffffffff816b9339 <+121>: test %r12,%r12 0xffffffff816b933c <+124>: je 0xffffffff816b935f <bt_iter+159> 0xffffffff816b933e <+126>: mov %r12,%rdi 0xffffffff816b9341 <+129>: callq 0xffffffff813bd3e0 <__asan_load8> 0xffffffff816b9346 <+134>: lea 0x138(%r13),%rdi 0xffffffff816b934d <+141>: mov (%r12),%r14 0xffffffff816b9351 <+145>: callq 0xffffffff813bd3e0 <__asan_load8> 0xffffffff816b9356 <+150>: cmp 0x138(%r13),%r14 0xffffffff816b935d <+157>: je 0xffffffff816b936f <bt_iter+175> BTW, rq may but does not have to refer to tags->static_rqs[...]. It may also refer to hctx->fq.flush_rq.But even those are persistent for the lifetime of the queue... But since kasan complains it belongs to a specific page, I'm guessing it's one of the regular requests since those are out of a chopped up page. Which means it makes even less sense. Is this happening while devices are being actively torn down? And are you using shared tags? That's the only way I could see this triggering.Or could it be caused by the stale request in hctx->tags->rqs[] slot ? We don't clear it after free the requests. And there could be a scenario like, There used to be a io scheduler attached. After some workload, the io scheduler is detached. So there could be rqs allocated by the io scheduler left in hctx->tags->rqs. blk_mq_get_request blk_mq_queue_tag_busy_iter -> blk_mq_get_tag -> bt_for_each -> bt_iter -> rq = taags->rqs[] -> rq->q -> blk_mq_rq_ctx_init -> data->hctx->tags->rqs[rq->tag] = rq; If the scenario is possible, maybe we could fix it as following.
Ah yes, good point, I bet that's what it is. But we just had this exact discussion in another thread, and my point there was that we should clear these when they go away, not inline. So how about clearing entries when the sched tags go away? -- Jens Axboe