Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!


From: Mike Snitzer <hidden>
Date: 2015-11-25 20:23:21
Also in: linux-scsi, lkml

On Wed, Nov 25 2015 at 2:24pm -0500,
Jens Axboe wrote:

On 11/25/2015 12:10 PM, Hannes Reinecke wrote:

On 11/25/2015 06:56 PM, Jens Axboe wrote:

On 11/25/2015 02:04 AM, Hannes Reinecke wrote:

On 11/20/2015 04:28 PM, Ewan Milne wrote:

On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote:

Can't we have a joint effort here?
I've been spending a _LOT_ of time trying to debug things here, but
none of the ideas I've come up with have been able to fix anything.

Yes.  I'm not the one primarily looking at it, and we don't have a
reproducer in-house.  We just have the one dump right now.

I'm almost tempted to increase the count from scsi_alloc_sgtable()
by one and be done with ...

That might not fix it if it is a problem with the merge code, though.

And indeed, it doesn't.
Seems I finally found the culprit.

What happens is this:
We have two paths, with these seg_boundary_masks:

path-1:    seg_boundary_mask = 65535,
path-2:    seg_boundary_mask = 4294967295,

consequently the DM request queue has this:

md-1:    seg_boundary_mask = 65535,

What happens now is that a request is built and sent to path 2.
During submission req->nr_phys_segments is computed with the
limits of path 2, arriving at a count of 3.
Now the request gets retried on path 1, but as the NOMERGE request
flag is set, req->nr_phys_segments is never updated.
But blk_rq_map_sg() ignores all counters and just walks the
bi_vec directly, resulting in a count of 4 -> boom.
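
To make the 3-versus-4 discrepancy concrete, here is a minimal user-space sketch of how the same buffer list can yield different segment counts under the two masks above. It is not the kernel's code: struct vec, same_window() and count_segments() are hypothetical names, the addresses are made up, and the model only merges physically contiguous buffers while ignoring max_segment_size and buffers that cross a boundary on their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one bio vector: physical address and length. */
struct vec {
    uint64_t addr;
    uint32_t len;
};

/* True when both byte addresses fall inside the same (mask + 1)-sized window. */
static int same_window(uint64_t first, uint64_t last, uint64_t mask)
{
    return (first | mask) == (last | mask);
}

/*
 * Count segments for a list of buffers under a given seg_boundary_mask.
 * A buffer is merged into the current segment only if it is contiguous
 * with it and the merged segment stays inside one boundary window.
 */
static unsigned count_segments(const struct vec *v, size_t n, uint64_t mask)
{
    unsigned nsegs = 0;
    uint64_t seg_start = 0, seg_end = 0;

    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint64_t end = v[i].addr + v[i].len;

        if (nsegs > 0 && v[i].addr == seg_end &&
            same_window(seg_start, end - 1, mask)) {
            seg_end = end;          /* extend the current segment */
            continue;
        }
        nsegs++;                    /* start a new segment */
        seg_start = v[i].addr;
        seg_end = end;
    }
    return nsegs;
}
```

With four 4 KiB buffers at 0x0F000, 0x10000, 0x30000 and 0x50000, the first two are contiguous but straddle a 64 KiB boundary. In this model they merge under mask 4294967295 (three segments) and stay separate under mask 65535 (four segments), which matches the mismatch described above.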

So the culprit here is the NOMERGE flag, which is evaluated
via
->dm_dispatch_request()
  ->blk_insert_cloned_request()
    ->blk_rq_check_limits()

If the above assessment is correct, the following patch should
fix it:

diff --git a/block/blk-core.c b/block/blk-core.c
index 801ced7..12cccd6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c

@@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio);
  */
 int blk_rq_check_limits(struct request_queue *q, struct request *rq)
 {
-       if (!rq_mergeable(rq))
+       if (rq->cmd_type != REQ_TYPE_FS)
                return 0;

        if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q,

rq->cmd_flags)) {


Mike? Jens?
Can you comment on it?

We only support merging on REQ_TYPE_FS already, so how is the above
making it any different? In general, NOMERGE being set or not should not
make a difference. It's only a hint that we need not check further if we
should be merging on this request, since we already tried it once, found
we'd exceed various limits, then set NOMERGE to reflect that.

The problem is that NOMERGE does too much, as it inhibits _any_ merging.

Right, that is the point of the flag from the block layer view,
where it was originally added for the case mentioned.

And we really don't want _any_ merging.  The merging, if any, will have
already happened in upper DM-multipath's elevator.  So there should be
no need to have the underlying SCSI paths do any merging.

Unfortunately, the req->nr_phys_segments value is evaluated in the final
_driver_ context _after_ the merging happened; cf.
scsi_lib.c:scsi_init_sgtable().
As nr_phys_segments is inherited from the original request (and never
recalculated with the new request queue limits), the following
blk_rq_map_sg() call might arrive at a different count, especially
after retrying a request on another path.
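
The effect of the early return can be modeled in a few lines of user-space C. This is only a toy under stated assumptions: toy_request, TOY_NOMERGE, toy_check_limits() and toy_sg_table_overflows() are hypothetical names, and the "recount" argument stands in for whatever the new queue's limits would produce. It is not the real blk_rq_check_limits().

```c
#include <stdbool.h>

#define TOY_NOMERGE 0x1u            /* hypothetical stand-in for the NOMERGE flag */

struct toy_request {
    unsigned flags;
    unsigned nr_phys_segments;      /* cached count from the previous path */
};

/*
 * Toy limit check. With skip_if_nomerge set it models the behaviour
 * described in the thread: a NOMERGE request returns early and keeps its
 * stale cached count. Otherwise the count is refreshed from `recount`
 * before being compared against the queue limit.
 */
static int toy_check_limits(struct toy_request *rq, unsigned recount,
                            unsigned max_segments, bool skip_if_nomerge)
{
    if (skip_if_nomerge && (rq->flags & TOY_NOMERGE))
        return 0;

    rq->nr_phys_segments = recount;
    if (rq->nr_phys_segments > max_segments)
        return -1;
    return 0;
}

/* The sg table is sized from the cached count; mapping more entries overflows it. */
static bool toy_sg_table_overflows(const struct toy_request *rq, unsigned mapped)
{
    return mapped > rq->nr_phys_segments;
}
```

A request cached at 3 segments that maps to 4 on the second path overflows when the check is skipped, and does not once the count is refreshed.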

That all sounds pretty horrible. Why is blk_rq_check_limits()
checking for mergeable at all? If merging is disabled on the
request, I'm assuming that's an attempt at an optimization since we
know it won't change. But that should be tracked separately, like
how it's done on the bio.

Not clear to me why it was checking for merging...
