Thread (1 message) 1 message, 1 author, 2016-09-05

Re: Oops when completing request on the wrong queue

From: Gabriel Krisman Bertazi <hidden>
Date: 2016-09-05 12:02:56
Also in: linux-nvme

Jens Axboe [off-list ref] writes:

> On 08/29/2016 12:06 PM, Gabriel Krisman Bertazi wrote:
>> Jens Axboe [off-list ref] writes:
>>> Can you try this patch? It's not perfect, but I'll be interested if it
>>> makes a difference for you.
>>
>> Hi Jens,
>>
>> Sorry for the delay.  I just got back to this and have been running your
>> patch on top of 4.8 without a crash for over 1 hour.  I wanna give it
>> more time to make sure it's running properly, though.
>>
>> Let me get back to you after a few more rounds of test.
>
> Thanks, sounds good. The patches have landed in mainline too.

Hi Jens,

Our test teams ran stress tests on several machines over the last week
on a test kernel with your patches applied, and were no longer able to
reproduce the issue.

Thanks a lot for helping out on this one.

>>> This one should handle the WARN_ON() for running the hw queue on the
>>> wrong CPU as well.
>>
>> On the workaround you added to prevent WARN_ON, we surely need to
>> prevent blk_mq_hctx_next_cpu from scheduling dead cpus in the first
>> place, right..  How do you feel about the following RFC?  I know it's
>> not a complete fix, but it feels like a good improvement to me.
>>
>> http://www.spinics.net/lists/linux-scsi/msg98608.html
>
> But we can't completely prevent it, and I don't think we have to. I just
> don't want to trigger a warning for something that's a valid condition.
> I want the warning to trigger if this happens without the CPU going
> offline, since then it's indicative of a real bug in the mapping. Your
> patch isn't going to prevent it either - it'll shrink the window, at the
> expense of making blk_mq_hctx_next_cpu() more expensive. So I don't
> think it's worthwhile.

Right, I got your point.  Your patch definitely prevents the WARN_ON
from occurring on CPU hotplug events too.  So thanks a lot for help on
that too :)
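
The trade-off discussed above can be illustrated with a small userspace
model in C.  This is only a sketch and not the kernel code: struct
hw_queue, cpu_is_online, next_cpu_cheap(), next_cpu_filtered() and
should_warn() are hypothetical names made up for illustration, and the
warning condition is just one reading of the behaviour described above
(warn when a queue runs on an unmapped CPU while the CPU it was aimed at
is still online).

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical model state; not the kernel's hardware-context or cpumask types. */
struct hw_queue {
    bool mapped[NR_CPUS]; /* CPUs this hardware queue is mapped to */
    int next_cpu;         /* CPU chosen for the next queue run */
};

static bool cpu_is_online[NR_CPUS] = { true, true, true, true };

/* Cheap selection: next mapped CPU after the cursor, online or not. */
static int next_cpu_cheap(struct hw_queue *hq)
{
    for (int i = 1; i <= NR_CPUS; i++) {
        int cpu = (hq->next_cpu + i) % NR_CPUS;
        if (hq->mapped[cpu]) {
            hq->next_cpu = cpu;
            return cpu;
        }
    }
    return -1; /* no CPU is mapped */
}

/*
 * Filtering selection: additionally skip CPUs that are offline right now.
 * A CPU can still go offline after this returns, so the extra check only
 * shrinks the race window while adding work to every selection.
 */
static int next_cpu_filtered(struct hw_queue *hq)
{
    for (int i = 1; i <= NR_CPUS; i++) {
        int cpu = (hq->next_cpu + i) % NR_CPUS;
        if (hq->mapped[cpu] && cpu_is_online[cpu]) {
            hq->next_cpu = cpu;
            return cpu;
        }
    }
    return -1; /* no mapped CPU is online */
}

/*
 * Assumed warning policy: running on an unmapped CPU is only suspicious
 * if the CPU the queue was aimed at is still online.  If that CPU went
 * offline, the mismatch is an expected side effect of hotplug.
 */
static bool should_warn(const struct hw_queue *hq, int running_cpu)
{
    assert(running_cpu >= 0 && running_cpu < NR_CPUS);
    return !hq->mapped[running_cpu] && cpu_is_online[hq->next_cpu];
}
```

In this model, a queue mapped to CPUs 0 and 1 that ends up running on
CPU 2 because CPU 1 went offline does not warn, while the same mismatch
with CPU 1 still online does.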

-- 
Gabriel Krisman Bertazi


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme