Thread (29 messages), 6 authors, 2018-12-10

Re: [RFC PATCH v1 0/7] Block/XFS: Support alternative mirror device retry

From: Darrick J. Wong <hidden>
Date: 2018-12-10 04:30:15
Also in: linux-fsdevel, linux-xfs, lkml

On Sat, Dec 08, 2018 at 10:49:44PM +0800, Bob Liu wrote:
On 11/28/18 3:45 PM, Christoph Hellwig wrote:
On Wed, Nov 28, 2018 at 04:33:03PM +1100, Dave Chinner wrote:
	- how does propagation through stacked layers work?
The only way it works is by each layer driving it.  Thus my
recommendation above, building on your earlier one, to use an index
that is filled by the driver at I/O completion time.

E.g.

	bio_init:		bi_leg = -1

	raid1:			submit bio to lower driver
	raid 1 completion:	set bi_leg to 0 or 1

Now if we want to allow stacking we need to save/restore bi_leg
before submitting to the underlying device.  Which is possible,
but quite a bit of work in the drivers.
I found this is still very challenging while writing the code.
Save/restore of bi_leg may not be enough, because the drivers don't know
how to do fs-metadata verification.

E.g. two-layer raid1 stacking:

fs:                  md0(copies:2)
                     /          \
layer1/raid1   md1(copies:2)    md2(copies:2)
                  /    \          /     \
layer2/raid1   dev0   dev1      dev2    dev3

Assume dev2 is corrupted:
 => md2: doesn't know how to do fs-metadata verification.
   => md0: fs verification fails, retry md1 (preserve md2).
Then md2 will never be retried, even though dev3 may also have the right copy.
Unless the upper layer device (md0) can know that the number of copies is 4 instead of 2?
And we need a way to handle the mapping.
Did I miss something? Thanks!
<shrug> It seems reasonable to me that the raid1 layer should set the
number of retries to (number of raid1 mirrors) * min(retry count of all
mirrors) so that the upper layer device (md0) would advertise 4 retry
possibilities instead of 2.
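
A minimal sketch of that arithmetic for the md0/md1/md2 example, in
userspace C.  Everything here is hypothetical and for illustration only:
struct toy_dev, toy_nr_retries() and toy_leaf_for_retry() are not kernel
interfaces, and the sketch assumes a leaf device advertises exactly one
retry possibility.  The second helper shows one possible way to handle
the mapping from a flat retry index back to a leaf device.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of one device in a raid1 stack (not a kernel struct). */
struct toy_dev {
	const char *name;
	int nr_mirrors;                    /* 0 for a leaf device */
	const struct toy_dev *mirrors[2];  /* at most two legs in this sketch */
};

/*
 * Retry possibilities a device would advertise to the layer above it.
 * Assumption: a leaf advertises 1.  A raid1 advertises
 * (number of mirrors) * min(retry count of all mirrors).
 */
static int toy_nr_retries(const struct toy_dev *d)
{
	int i, min_r;

	if (d->nr_mirrors == 0)
		return 1;

	min_r = toy_nr_retries(d->mirrors[0]);
	for (i = 1; i < d->nr_mirrors; i++) {
		int r = toy_nr_retries(d->mirrors[i]);

		if (r < min_r)
			min_r = r;
	}
	return d->nr_mirrors * min_r;
}

/*
 * Map a flat retry index in [0, toy_nr_retries(d)) to the leaf that
 * would serve it: each leg owns an equal-sized slice of the index space.
 */
static const struct toy_dev *toy_leaf_for_retry(const struct toy_dev *d, int idx)
{
	int per_leg;

	if (d->nr_mirrors == 0)
		return d;

	per_leg = toy_nr_retries(d) / d->nr_mirrors;
	return toy_leaf_for_retry(d->mirrors[idx / per_leg], idx % per_leg);
}

/* The two-layer stack from the example above. */
static const struct toy_dev dev0 = { "dev0", 0, { 0, 0 } };
static const struct toy_dev dev1 = { "dev1", 0, { 0, 0 } };
static const struct toy_dev dev2 = { "dev2", 0, { 0, 0 } };
static const struct toy_dev dev3 = { "dev3", 0, { 0, 0 } };
static const struct toy_dev md1 = { "md1", 2, { &dev0, &dev1 } };
static const struct toy_dev md2 = { "md2", 2, { &dev2, &dev3 } };
static const struct toy_dev md0 = { "md0", 2, { &md1, &md2 } };

/* Self-check: md0 advertises 4, and index 3 reaches dev3 behind md2. */
void toy_selfcheck(void)
{
	assert(toy_nr_retries(&md0) == 4);
	assert(strcmp(toy_leaf_for_retry(&md0, 3)->name, "dev3") == 0);
}
```

With this slicing, a filesystem walking retry indexes 0..3 on md0 would
reach dev0, dev1, dev2 and dev3 in turn, so dev3 is still tried after
dev2 fails verification.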

--D

-Bob
	- is it generic/abstract enough to be able to work with
	  RAID5/6 to trigger verification/recovery from the parity
	  information in the stripe?
If we get a non -1 bi_leg for parity raid, this is an indicator
that a parity rebuild needs to happen.  For multi-parity setups we could
also use different levels there.