Thread (10 messages) 10 messages, 4 authors, 2019-08-21

Re: 5.3-rc1 regression with XFS log recovery

From: Dave Chinner <david@fromorbit.com>
Date: 2019-08-20 21:45:20
Also in: linux-xfs

On Tue, Aug 20, 2019 at 05:24:25PM +0800, Ming Lei wrote:
quoted hunk ↗ jump to hunk
On Tue, Aug 20, 2019 at 04:13:26PM +0800, Ming Lei wrote:
quoted
On Tue, Aug 20, 2019 at 07:53:20AM +0200, hch@lst.de wrote:
quoted
On Tue, Aug 20, 2019 at 02:41:35PM +1000, Dave Chinner wrote:
quoted
quoted
With the following debug patch.  Based on that I think I'll just
formally submit the vmalloc switch as we're at -rc5, and then we
can restart the unaligned slub allocation drama..
This still doesn't make sense to me, because the pmem and brd code
have no aligment limitations in their make_request code - they can
handle byte adressing and should not have any problem at all with
8 byte aligned memory in bios.

Digging a little furhter, I note that both brd and pmem use
identical mechanisms to marshall data in and out of bios, so they
are likely to have the same issue.

So, brd_make_request() does:

        bio_for_each_segment(bvec, bio, iter) {
                unsigned int len = bvec.bv_len;
                int err;

                err = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
                                  bio_op(bio), sector);
                if (err)
                        goto io_error;
                sector += len >> SECTOR_SHIFT;
        }

So, the code behind bio_for_each_segment() splits multi-page bvecs
into individual pages, which are passed to brd_do_bvec(). An
unaligned 4kB io traces out as:

 [  121.295550] p,o,l,s 00000000a77f0146,768,3328,0x7d0048
 [  121.297635] p,o,l,s 000000006ceca91e,0,768,0x7d004e

i.e. page		offset	len	sector
00000000a77f0146	768	3328	0x7d0048
000000006ceca91e	0	768	0x7d004e

You should be able to guess what the problems are from this.
The problem should be that offset of '768' is passed to bio_add_page().
It can be quite hard to deal with non-512 aligned sector buffer, since
one sector buffer may cross two pages, so far one workaround I thought
of is to not merge such IO buffer into one bvec.

Verma, could you try the following patch?
diff --git a/block/bio.c b/block/bio.c
index 24a496f5d2e2..49deab2ac8c4 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -769,6 +769,9 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
 		return false;
 
+	if (off & 511)
+		return false;
What does this acheive? It only prevents the unaligned segment from
being merged, it doesn't prevent it from being added to a new bvec.

However, the case here is that:
quoted
quoted
quoted
i.e. page		offset	len	sector
00000000a77f0146	768	3328	0x7d0048
000000006ceca91e	0	768	0x7d004e
The second page added to the bvec is actually offset alignedr. Hence
the check would do nothing on the first page because the bvec array
is empty (so goes into a new bvec anyway), and the check on the
second page would do nothing an it would merge with first because
the offset is aligned correctly. In both cases, the length of the
segment is not aligned, so that needs to be checked, too.

IOWs, I think the check needs to be in bio_add_page, it needs to
check both the offset and length for alignment, and it needs to grab
the alignment from queue_dma_alignment(), not use a hard coded value
of 511.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help