Thread (2 messages) 2 messages, 2 authors, 2003-09-27
STALE8315d

[PATCH] fixes defect with kernel BUG using multipath on 2.6.0-test5

From: Steven Dake <hidden>
Date: 2003-09-27 00:34:42
Also in: linux-scsi, lkml

Folks,
Thanks Matt and Jens for the debug help on the multipath problem.  I now
have a patch (attached) which solves the problem and makes multipath
work properly.   There are two types of "flags" that are used in a block
io request, bi_flags, and bi_rw.  bi_flags is used for flags to the
block level code, and bi_rw is used for flags to the low level device
drivers.  The code in the multipath driver used the wrong flag in the
wrong field.  In this case, the flag FASTFAIL (value 3) was being set to
the bi_flags field.  FASTFAIL is a hint to the low level driver that it
should try to fail out quickly.  Unfortunately, the value 3 is also
BIO_SEG_VALID, which is a flag to the block subsystem that the segments
shouldn't be recalculated.  The result was that the wrong field was set,
telling the block layer not to recalculate the segments resulting in
phys and hw segments of 0.  Not good.

Neil can you send upstream ?

Thanks
-steve

On Fri, 2003-09-26 at 13:14, Steven Dake wrote:
On Fri, 2003-09-26 at 05:26, Jens Axboe wrote: 
quoted
On Fri, Sep 26 2003, Matthew Wilcox wrote:
quoted
On Thu, Sep 25, 2003 at 06:57:15PM -0700, Steven Dake wrote:
quoted
kernel BUG at drivers/scsi/scsi_lib.c:544!
        BUG_ON(!cmd->use_sg);
quoted
 [<c01f631d>] scsi_init_io+0x7a/0x13d
static int scsi_init_io(struct scsi_cmnd *cmd)
        struct request     *req = cmd->request;
        cmd->use_sg = req->nr_phys_segments;
        sgpnt = scsi_alloc_sgtable(cmd, GFP_ATOMIC);
quoted
 [<c01f6455>] scsi_prep_fn+0x75/0x171
static int scsi_prep_fn(struct request_queue *q, struct request *req)
        struct scsi_cmnd *cmd;
        cmd->request = req;
        ret = scsi_init_io(cmd);

.. this is getting outside my area of confidence.  Ask axboe why we might
get a zero nr_phys_segments request passed in.
Looks like an mp bug. I'd suggest adding something ala

	if (!rq->nr_phys_segments || !rq->nr_hw_segments) {
		blk_dump_rq_flags(req, "scsi_init_io");
		return BLKPREP_KILL;
	}

inside the first

	} else if (req->flags & (REQ_CMD | REQ_BLOCK_PC)) {

drivers/scsi/scsi_lib.c:scsi_prep_fn(). That will show the state of such
a buggy request. I'm pretty sure this is an mp bug though.
scsi_prep_fn: dev sdd: flags = REQ_CMD REQ_STARTED
sector 0, nr/cnr 8/8
bio c2694708, biotail c2694708, buffer f76dc000, data 00000000, len 0
multipath: IO failure on sdd, disabling IO path.
        Operation continuing on 1 IO paths.
multipath: sdd: rescheduling sector 8
MULTIPATH conf printout:
-- wd:1 rd:2
disk0, o:0, dev:sdd
disk1, o:1, dev:sdb
MULTIPATH conf printout:
-- wd:1 rd:2
disk1, o:1, dev:sdb
multipath: sdd: redirecting sector 0 to another IO path
scsi_prep_fn: dev sdb: flags = REQ_CMD REQ_STARTED
sector 0, nr/cnr 0/8
bio c2694708, biotail c2694708, buffer f76dc000, data 00000000, len 0

I assume a length of zero is wrong...  I'll trace up the stack and see
where the bad data gets into the request.

Thanks for the pointer...
-steve  

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help