Re: Oops when completing request on the wrong queue

From: Keith Busch <hidden>
Date: 2016-08-23 22:49:45
Also in: linux-nvme, linux-scsi

On Tue, Aug 23, 2016 at 03:14:23PM -0600, Jens Axboe wrote:
> On 08/23/2016 03:11 PM, Jens Axboe wrote:
> > My workload looks similar to yours, in that it's high depth and with a
> > lot of jobs to keep most CPUs loaded. My bash script is different than
> > yours, I'll try that and see if it helps here.
>
> Actually, I take that back. You're not using O_DIRECT, hence all your
> jobs are running at QD=1, not the 256 specified. That looks odd, but
> I'll try, maybe it'll hit something different.

I haven't recreated this either, but I think I can logically see why
this failure is happening.

I sent an nvme driver patch earlier in this thread to exit the hardware
context, which I thought would do the trick if the hctx's tags were
being moved. That turns out to be wrong for a couple of reasons.

First, we can't release the nvmeq->tags when a hctx exits because
that nvmeq may be used by other namespaces that need to point to
the device's tag set.

The other reason is that blk-mq doesn't exit or init hardware contexts
when remapping for a CPU event, leaving the nvme driver unaware that a
hardware context now points to a different tag set.
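
To make the second point concrete, here is a toy model of what I think
is going on. The toy_* names are made up for illustration and are not
the real blk-mq or nvme structures; the only assumption is that the
driver caches the tags pointer once in its init_hctx callback and that
a remap repoints the hctx without calling exit/init again.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the real blk-mq or nvme structures. */
struct toy_tags { int id; };

struct toy_hctx {
	struct toy_tags *tags;	/* may be repointed on a CPU remap */
};

struct toy_nvmeq {
	struct toy_tags *tags;	/* cached once at init_hctx time */
};

/* Models the driver's init_hctx callback: cache the hctx's tags. */
void toy_init_hctx(struct toy_nvmeq *q, struct toy_hctx *h)
{
	q->tags = h->tags;
}

/* Models a remap that repoints the hctx without calling exit/init. */
void toy_remap(struct toy_hctx *h, struct toy_tags *new_tags)
{
	h->tags = new_tags;
}

/* Returns 1 if the driver's cached pointer no longer matches the hctx. */
int toy_is_stale(const struct toy_nvmeq *q, const struct toy_hctx *h)
{
	return q->tags != h->tags;
}
```

After toy_remap() the queue still holds the old pointer, so a completion
that looks up a request through it would be searching the wrong tags.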

So I think I see why this test would fail; don't know about a fix yet.
Maybe the nvme driver needs some indirection instead of pointing
directly to the tagset after init_hctx.
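
A rough sketch of the kind of indirection I mean, again with made-up
ind_* names rather than actual driver code, and not a tested fix: the
queue keeps the address of a slot in the tag set and dereferences it
each time, so whatever the slot currently holds is what a completion
sees. This assumes a remap updates the slot in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct ind_tags { int id; };

struct ind_tagset {
	struct ind_tags *tags[4];	/* one slot per hardware queue */
};

struct ind_nvmeq {
	struct ind_tags **tags;		/* address of a slot, not the tags */
};

/* Models init_hctx: remember where the slot lives, not its contents. */
void ind_init_hctx(struct ind_nvmeq *q, struct ind_tagset *set,
		   unsigned int idx)
{
	q->tags = &set->tags[idx];
}

/* Resolve the tags at use time, picking up any later repointing. */
struct ind_tags *ind_queue_tags(const struct ind_nvmeq *q)
{
	return *q->tags;
}
```

The cost is one extra dereference on the completion path; whether that
is acceptable, and whether the slot really is updated in place on a
remap, would need checking against the actual code.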
