Thread: 6 messages, 4 authors, 2016-10-24

Re: How to fix mistake on raid: mdadm create instead of assemble?

From: Shaohua Li <shli@kernel.org>
Date: 2016-10-21 22:35:24

On Fri, Oct 21, 2016 at 10:45:10AM +0200, Santiago DIEZ wrote:
Hi,

Thanks Andreas,

Yes apparently, 3/4 of the original disks seem to be safe. But I'm
terrified at the idea of doing something wrong assembling them.
Incidentally, I did indeed make a mistake when trying to assemble the ddrescue
images of the 3 safe disks. I tried to create the array again with the proper
metadata and chunk size, but it did not work. I'm still scared of
restarting the original raid. I'm currently ddrescuing the 3
partitions again so that I can *assemble* them rather than *create*.


Thanks Wol,

I use loop devices because I work on partition images, not on actual partitions:
I use ddrescue to copy data from /dev/sd[abc]10 to
some.other.server:/home/sd[abc]10.img
Then I go to some.other.server and turn the images into loop devices :
losetup /dev/loop0 /home/sda10.img
losetup /dev/loop1 /home/sdb10.img
losetup /dev/loop2 /home/sdc10.img
Then I tried to create the raid. The command worked but, as I said, the
filesystem was unreadable.
I know the idea of using loop devices works because I tested it before.
I'm doing the whole procedure all over again (takes 5 days to ddrescue
the 3 partitions to another server) and then I will use the command
you recommended :
mdadm --assemble /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2 --force
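
The procedure described above (attach the three images as loop devices, then assemble rather than create) could be scripted roughly as follows. This is only a sketch: the image paths and device names are the ones quoted in this message, the `run` helper and the dry-run default are hypothetical conveniences, and `--readonly` on the assemble step is an extra precaution that was not part of the recommended command.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it unless RUN=1.
# The run helper is hypothetical; paths and devices follow the thread.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

i=0
devices=""
for img in /home/sda10.img /home/sdb10.img /home/sdc10.img; do
    # Attach each ddrescue image to its own loop device
    run losetup "/dev/loop$i" "$img"
    devices="${devices:+$devices }/dev/loop$i"
    i=$((i + 1))
done

# Assemble from the existing superblocks (never --create).
# --readonly is an added assumption: it starts the array read-only.
# $devices is left unquoted on purpose so it splits into three arguments.
run mdadm --assemble --force --readonly /dev/md0 $devices
```

Run as is, the script only prints the four commands so they can be reviewed first. Setting RUN=1 executes them and requires root.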


Will keep you posted

-------------------------
Santiago DIEZ
Quark Systems & CAOBA
23 rue du Buisson Saint-Louis, 75010 Paris
-------------------------

On Mon, Oct 10, 2016 at 12:39 AM, Wols Lists [off-list ref] wrote:
On 08/10/16 13:30, Andreas Klauer wrote:
On Fri, Oct 07, 2016 at 05:37:32PM +0200, Santiago DIEZ wrote:
First thing I did is ddrescue the remaining partitions sd[abc]10 .
ddrescue did not stumble into any read error so I assume all remaining
partitions are perfectly safe.
So ... don't you still have a good copy?

You only killed one of them, right? Did not make same mistake twice?
Here comes my mistake: I ran the --create command instead of --assemble:

================================================================================
# mdadm --create --verbose /dev/md1 --raid-devices=4 --level=raid5
--run --readonly /dev/loop0 /dev/loop1 /dev/loop2 missing
One oddity I've noticed. You've created the array using loop devices.
What are these?

The reason I ask is that using loopback devices is a standard technique
for rebuilding a damaged array, specifically to prevent md from actually
writing to the drive. So is it possible that "mdadm --create" only wrote
to ram, and a reboot will recover your ddrescue copies untouched?

My raid-fu isn't enough to tell me whether I'm right or not ... :-)

If necessary you'll have to do another ddrescue from the original
drives, and you should then be able to assemble the array from the
copies. Don't use "missing", use "--force" and you should get a working,
degraded, array to which you can add a new drive and rebuild the array.

mdadm --assemble /dev/md0 /dev/sd[efg]10 --force

if I'm right ... so long as it's the copies, you can always recover
again from the original disks, and if there's a problem with the copies
mdadm should complain when it assembles the array.
Hmm, those commands work for me. I'm adding Song and Jes in case they have ideas.

Thanks,
Shaohua
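
Before and after assembling the copies, it may help to look at what mdadm itself reports. The sketch below is an illustration, not something proposed in the thread: it prints (or, with RUN=1, runs) `mdadm --examine` on each loop device and `mdadm --detail` on the array. The device names are those used earlier in the thread, and the `run` helper is hypothetical. The `--examine` output includes fields such as "Creation Time" and "Events", which can indicate whether a copy still carries the original superblock or one written by a later --create.

```shell
#!/bin/sh
# Dry-run sketch: prints the inspection commands unless RUN=1.
# The run helper is hypothetical; device names follow the thread.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

checked=0
for dev in /dev/loop0 /dev/loop1 /dev/loop2; do
    # --examine reads and prints the md superblock of a member device
    run mdadm --examine "$dev"
    checked=$((checked + 1))
done

# Once the array is assembled, --detail reports its overall state
run mdadm --detail /dev/md0
```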