Thread (4 messages), 3 authors, 2010-01-18

Re: non-fresh data unavailable bug

From: Michael Evans <hidden>
Date: 2010-01-14 19:24:03

On Thu, Jan 14, 2010 at 7:10 AM, Brett Russ [off-list ref] wrote:
> Slightly related to my last message here Re: non-fresh behavior, we have seen
> cases where the following happens:
> * healthy 2 disk raid1 (disks A & B) incurs a problem with disk B
> * disk B is removed, unit is now degraded
> * replacement disk C is added; recovery from A to C begins
> * during recovery, disk A incurs a brief lapse in connectivity.  At this
> point C is still up yet only has a partial copy of the data.
> * a subsequent assemble operation on the raid1 results in disk A being
> kicked out as non-fresh, yet C is allowed in.
>
> This presents quite a data-unavailability problem and basically requires
> recognizing the situation and hand assembling the array with disk A (only)
> first, then adding C back in.  Unfortunately this situation is hard to
> reproduce and we don't have a dump of the 'mdadm --examine' output for it
> yet.
>
> Any thoughts on this while we try to get a better reproduction case?
>
> Thanks,
> Brett
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I believe the desired and logical behavior here is to refuse running
an incomplete array unless explicitly forced to do so.  Incremental
assembly might be what you're seeing.

Presuming your array is incomplete without the device that had the
hiccup, the only way to access the data on those devices would be to
force assembly with the older device included and hope.  I strongly
recommend running the array read-only until you can determine which
assembly pattern produces the most viable results.
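
For anyone following along, the manual recovery discussed in this thread
(assemble from disk A alone, read-only first, then add C back) could be
sketched roughly as below. This is only an illustration under assumptions:
the device names /dev/md0, /dev/sda1 and /dev/sdc1 and the helper
recovery_plan are hypothetical, and the script only prints the mdadm
command lines rather than running them. --force is included because of the
suggestion above; it may not be needed when only one member is listed, in
which case --run alone should start the degraded array.

```python
def recovery_plan(md_dev, good_dev, partial_dev):
    """Return the mdadm command lines, in order, as argv lists.

    Nothing is executed here. The device arguments are placeholders
    (assumptions), not names taken from the original report.
    """
    return [
        # Stop whatever a previous (auto/incremental) assembly started.
        ["mdadm", "--stop", md_dev],
        # Assemble from the complete disk only, degraded and read-only,
        # so the data can be inspected before anything is written.
        ["mdadm", "--assemble", "--force", "--run", "--readonly",
         md_dev, good_dev],
        # Once the data looks sane, allow writes again.
        ["mdadm", "--readwrite", md_dev],
        # Add the partially recovered disk back; recovery then restarts.
        ["mdadm", md_dev, "--add", partial_dev],
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for argv in recovery_plan("/dev/md0", "/dev/sda1", "/dev/sdc1"):
        print(" ".join(argv))
```

Check the printed commands against 'mdadm --examine' output for each member
before running anything by hand.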