Thread (2 messages), 2 authors, 2007-10-18

Re: kicking non-fresh member from array?

From: Mike Snitzer <hidden>
Date: 2007-10-18 19:04:27

On 10/18/07, Goswin von Brederlow [off-list ref] wrote:
"Mike Snitzer" [off-list ref] writes:
All,

I have repeatedly seen that when a 2-member raid1 becomes degraded
and IO continues to the lone good member, stopping and reassembling
the array gives:

md: bind<nbd0>
md: bind<sdc>
md: kicking non-fresh nbd0 from array!
md: unbind<nbd0>
md: export_rdev(nbd0)
raid1: raid set md0 active with 1 out of 2 mirrors

I'm not seeing how one can avoid assembling such an array in 2 passes:
1) assemble array with both members
2) if a member was deemed "non-fresh", re-add that member, thereby
triggering recovery.
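
As a rough illustration of the two passes listed above, here is a minimal Python sketch that only builds the mdadm command lines and runs nothing. It assumes the kernel log from pass 1 is available as text, that the short name md prints (e.g. nbd0) maps to /dev/<name>, and that `mdadm <array> --re-add <device>` is the right manage-mode call for the installed mdadm; the helper names `kicked_members` and `two_pass_commands` are made up for this example.

```python
import re

# Pattern matching the kernel message quoted in this thread.
KICKED_RE = re.compile(r"md: kicking non-fresh (\S+) from array!")


def kicked_members(log_text):
    """Return device names that md reported as non-fresh, in log order."""
    return KICKED_RE.findall(log_text)


def two_pass_commands(md_dev, member_devs, log_text):
    """Build the mdadm command lines for both passes (nothing is executed)."""
    # Pass 1: assemble with every member that was named.
    commands = [["mdadm", "--assemble", md_dev, *member_devs]]
    # Pass 2: re-add each member that md kicked out, which starts recovery.
    for name in kicked_members(log_text):
        # Assumption: the kernel's short name maps to /dev/<name>.
        commands.append(["mdadm", md_dev, "--re-add", "/dev/" + name])
    return commands


log = """md: bind<nbd0>
md: bind<sdc>
md: kicking non-fresh nbd0 from array!
md: unbind<nbd0>
md: export_rdev(nbd0)
raid1: raid set md0 active with 1 out of 2 mirrors"""

for cmd in two_pass_commands("/dev/md0", ["/dev/nbd0", "/dev/sdc"], log):
    print(" ".join(cmd))
```

In real use the log would only exist after pass 1 has run, so the second command would be derived then; the sketch takes both inputs at once only to stay side-effect free.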

So why does MD kick non-fresh members out on assemble when it's
perfectly capable of recovering the "non-fresh" member?  Looking at
md.c it is fairly clear there isn't a way to avoid this 2-step
procedure.

Why/how does MD benefit from this "kicking non-fresh" semantic?
Should MD/mdadm be made optionally tolerant of such non-fresh members
during assembly?

Mike
What if the disk has lots of bad blocks, just not where the metadata
is? On every restart you would resync and fail.

Or what if you removed a mirror to keep a snapshot of a previous
state? If it auto-resyncs you lose that snapshot.
Both of your examples are fairly tenuous given that such members
shouldn't have been provided on the --assemble command line.  I'm not
talking about auto assemble via udev or something.  But auto assemble
via udev brings up an annoying corner-case when you consider the 2
cases you pointed out.

So you have valid points.  This leads to my last question; having the
ability to _optionally_ tolerate (repair) such stale members would
allow for greater flexibility.  The current behavior isn't conducive
to repairing unprotected raids (that mdadm/md were told to assemble
with specific members) without taking steps to say "no I really
_really_ mean it; now re-add this disk!".

Any pointers from Neil (or others) on how such a 'repair "non-fresh"
member(s) on assemble' override _should_ be implemented would be
helpful.  My first thought is to add a new superblock
--update=repair-non-fresh option to mdadm that would tie into a new
flag in the MD superblock.  But then it raises the question: why not
first add support to set such a superblock option at MD create-time?
The validate_super methods would also need to be trained accordingly.
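
To make the proposal concrete, here is a toy Python model of the decision such an override would change. It is not md.c code. It assumes freshness is decided by comparing each member's superblock event counter against the highest one seen, with a small allowed lag (the `slack` value of 1 is an assumption of this sketch), and the `repair_non_fresh` flag is hypothetical, named after the --update option suggested above.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    events: int  # superblock event counter (illustrative values only)


def assemble(members, repair_non_fresh=False, slack=1):
    """Classify members as active, queued for recovery, or kicked.

    repair_non_fresh is a hypothetical flag modelling the proposed override;
    slack is an assumed tolerance on how far a counter may lag.
    """
    freshest = max(m.events for m in members)
    active, recover, kicked = [], [], []
    for m in members:
        if m.events + slack >= freshest:
            active.append(m.name)       # up to date: joins the array
        elif repair_non_fresh:
            recover.append(m.name)      # stale, but kept and rebuilt
        else:
            kicked.append(m.name)       # stale: current behaviour
    return active, recover, kicked


members = [Member("nbd0", 10), Member("sdc", 42)]
print(assemble(members))
print(assemble(members, repair_non_fresh=True))
```

With the flag off, the stale member lands in the kicked list, matching the log in the original report; with it on, the same member is queued for recovery instead, which is the single-pass behaviour being asked for.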

regards,
Mike