Re: kernel oops with blk-mq-sched latest

From: Jens Axboe <axboe@kernel.dk>
Date: 2017-01-18 14:40:06
Also in: dm-devel

On 01/18/2017 03:48 AM, Hannes Reinecke wrote:

> Nearly there.
> You're missing a 'blk_mq_start_hw_queues(q)' after
> blk_mq_unfreeze_queue(); without it the queue will stall after switching
> the scheduler.

Yes indeed, forgot that. Needed after the quiesce.
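
For readers following along, the ordering issue can be modelled in plain
user-space C. This is only a sketch, not kernel code: the toy_* names and
the struct are made up for illustration, and it assumes (as the reply
above implies) that the quiesce step leaves the hardware queues stopped
until something explicitly starts them again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	int freeze_depth;	/* > 0 while the queue is frozen */
	bool hw_stopped;	/* set by quiesce, cleared by start */
	int dispatched;		/* requests that reached the "hardware" */
};

static void toy_freeze(struct toy_queue *q)
{
	q->freeze_depth++;
}

static void toy_unfreeze(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Assumption of this model: quiesce stops the hw queues and nothing else restarts them. */
static void toy_quiesce(struct toy_queue *q)
{
	q->hw_stopped = true;
}

static void toy_start_hw_queues(struct toy_queue *q)
{
	q->hw_stopped = false;
}

/* Returns true if a request could be dispatched, false if the queue is stalled. */
static bool toy_dispatch(struct toy_queue *q)
{
	if (q->freeze_depth > 0 || q->hw_stopped)
		return false;
	q->dispatched++;
	return true;
}

/* Scheduler switch; 'restart' selects whether the hw queues are started afterwards. */
static void toy_switch_sched(struct toy_queue *q, bool restart)
{
	toy_freeze(q);
	toy_quiesce(q);
	/* ... the scheduler would be swapped here ... */
	toy_unfreeze(q);
	if (restart)
		toy_start_hw_queues(q);
}
```

In this model, toy_switch_sched(&q, false) leaves toy_dispatch() returning
false even though the queue is no longer frozen, which is the stall
described above; passing true lets dispatch proceed again.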

> Also what's quite suspicious is this:
>
> struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
> 				    struct request_queue *q)
> {
> 	struct blkcg_gq *blkg;
>
> 	WARN_ON_ONCE(!rcu_read_lock_held());
> 	lockdep_assert_held(q->queue_lock);
>
> 	/*
> 	 * This could be the first entry point of blkcg implementation and
> 	 * we shouldn't allow anything to go through for a bypassing queue.
> 	 */
> 	if (unlikely(blk_queue_bypass(q)))
> 		return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
>
> which now won't work as the respective flags aren't set anymore.
> Not sure if that's a problem, though.
> But you might want to look at that, too.

dying is still used on blk-mq, but yes, the bypass check should now be
frozen for blk-mq. Not really directly related to the above change,
but it should be fixed up.
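
One possible reading of that suggestion, again as a user-space sketch
rather than a patch: keep the bypass test for legacy queues and test the
frozen state for blk-mq queues, with the same error codes as the quoted
snippet. The struct, its fields and toy_blkg_gate() are hypothetical
names for this illustration; how the real code would detect a frozen
blk-mq queue is not specified in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the queue state the check consults; hypothetical. */
struct toy_qstate {
	bool is_mq;	/* blk-mq queue rather than the legacy path */
	bool bypass;	/* legacy bypass flag */
	bool frozen;	/* blk-mq freeze in effect */
	bool dying;	/* queue is being torn down */
};

/* Returns 0 if the lookup may proceed, otherwise a negative errno. */
static int toy_blkg_gate(const struct toy_qstate *q)
{
	bool blocked;

	assert(q);

	/* Assumed mapping: frozen plays the role of bypass on blk-mq. */
	blocked = q->is_mq ? q->frozen : q->bypass;
	if (blocked)
		return q->dying ? -ENODEV : -EBUSY;
	return 0;
}
```

With this gate, a frozen blk-mq queue is refused with -EBUSY (or -ENODEV
if it is also dying) even though its bypass flag is never set.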

> Nevertheless, with the mentioned modifications to your patch the crashes
> don't occur anymore.

Great

> Sad news is that it doesn't help _that_ much on spinning rust mpt3sas;
> there I still see a ~50% performance penalty on reads.
> Write's slightly better than sq performance, though.

What is the test case? Full details please, from hardware to what you
are running. As I've mentioned before, I don't necessarily think your
performance issues are related to scheduling. Would be nice to get
to the bottom of it, though. And for that, I need more details.

-- 
Jens Axboe
