Thread: 3 messages, 2 authors, 2021-06-15

Re: [PATCH] blk-mq: Do not lookup ctx with invalid index

From: Ming Lei <hidden>
Date: 2021-06-15 03:37:55
Also in: lkml

On Mon, Jun 14, 2021 at 01:37:06PM +0200, Daniel Wagner wrote:
> On Tue, Jun 08, 2021 at 08:33:39PM +0200, Daniel Wagner wrote:
> > cpumask_first_and() returns >= nr_cpu_ids if the two provided masks do
> > not share a common bit. Verify we get a valid value back from
> > cpumask_first_and().
>
> So I got feedback on this issue (but not on the patch itself yet). The
> system starts with 16 virtual CPU cores and during the test 4 cores are
> removed[1] and as soon as there is an error on the storage side, the reset
> code on the host ends up in this path and crashes. I still don't
> understand why the CPU removal is not updating the CPU mask correctly
> before we hit the reset path. I'll continue to investigate.

We don't update hctx->cpumask when a CPU is added or removed; it is
assigned against cpu_possible_mask from the beginning.

It is a long-standing issue, which can be triggered when all CPUs in
hctx->cpumask go offline. The thing is that only nvmf_connect_io_queue()
allocates a request via a specified hctx.
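
As a rough user-space illustration of the failure mode (a sketch, not the
kernel code): the model below uses plain 64-bit integers as CPU masks.
first_and() only mimics the documented return behaviour of
cpumask_first_and(), pick_cpu() is a hypothetical helper, and the choice
of CPUs 12-15 for the hctx mask is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPU_IDS 16u	/* assumption: 16 possible CPUs, as in the report */

/*
 * Model of cpumask_first_and(): index of the first bit set in both
 * masks, or NR_CPU_IDS when the masks share no bit.
 */
static unsigned int first_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Hypothetical helper: pick an online CPU for an hctx and report
 * failure instead of handing back an out-of-range index.
 */
static bool pick_cpu(uint64_t hctx_mask, uint64_t online_mask,
		     unsigned int *cpu)
{
	unsigned int c = first_and(hctx_mask, online_mask);

	if (c >= NR_CPU_IDS)
		return false;	/* every CPU of this hctx is offline */
	*cpu = c;
	return true;
}
```

With an hctx mask of 0xF000 (CPUs 12-15) and an online mask of 0x0FFF
(those four CPUs removed), first_and() returns NR_CPU_IDS, so any lookup
that uses the result as an index without checking it reads out of range.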

thanks,
Ming