Re: kernel oops with blk-mq-sched latest
From: Jens Axboe <axboe@kernel.dk>
Date: 2017-01-18 14:40:06
Also in: dm-devel
On 01/18/2017 03:48 AM, Hannes Reinecke wrote:
Nearly there. You're missing a 'blk_mq_start_hw_queues(q)' after blk_mq_unfreeze_queue(); without it the queue will stall after switching the scheduler.
Yes indeed, forgot that. Needed after the quiesce.
Also what's quite suspicious is this:
struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
				    struct request_queue *q)
{
	struct blkcg_gq *blkg;

	WARN_ON_ONCE(!rcu_read_lock_held());
	lockdep_assert_held(q->queue_lock);

	/*
	 * This could be the first entry point of blkcg implementation and
	 * we shouldn't allow anything to go through for a bypassing queue.
	 */
	if (unlikely(blk_queue_bypass(q)))
		return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
which now won't work as the respective flags aren't set anymore.
Not sure if that's a problem, though.
But you might want to look at that, too.
dying is still used on blk-mq, but yes, the bypass check should now be frozen for blk-mq. Not really directly related to the above change, but it should be fixed up.
Nevertheless, with the mentioned modifications to your patch the crashes don't occur anymore.
Great
Sad news is that it doesn't help _that_ much on spinning rust mpt3sas; there I still see a ~50% performance penalty on reads. Write's slightly better than sq performance, though.
What is the test case? Full details please, from hardware to what you are running. As I've mentioned before, I don't necessarily think your performance issues are related to scheduling. Would be nice to get to the bottom of it, though. And for that, I need more details.
--
Jens Axboe