Re: Recovery of RAID1 fails (added disks stays as spare)
From: NeilBrown <hidden>
Date: 2013-08-17 00:50:37
On Thu, 15 Aug 2013 09:09:40 +0000 [off-list ref] wrote:
Hello,
I'm currently fighting a server problem and have the feeling that I'm running into walls.
Summary: One of our servers suffered a hard disk error that led to a degraded array.
The hardware was replaced and the array was rebuilt. On one of the RAID sets, the newly added
disk is not activated but stays as a spare.
System: SUSE Linux Enterprise Server 11 (x86_64) 11.2
The current state:
# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sda3[2](S) sdb3[0]
970888192 blocks [2/1] [U_]
md1 : active raid1 sda1[0] sdb1[1]
3911680 blocks [2/2] [UU]
unused devices: <none>
# mdadm --detail /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Fri Feb 4 11:47:04 2011
Raid Level : raid1
Array Size : 970888192 (925.91 GiB 994.19 GB)
Used Dev Size : 970888192 (925.91 GiB 994.19 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Thu Aug 15 10:22:07 2013
State : clean, degraded
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
UUID : e9d9c5f5:615c789e:3fb6082e:e5593158
Events : 0.18857541
Number Major Minor RaidDevice State
0 8 19 0 active sync /dev/sdb3
1 0 0 1 removed
2 8 3 - spare /dev/sda3
I would expect the raid system to move /dev/sda3 to number 1 and mark it as active.
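The symptom above (a member marked (S) while the array still shows a missing slot) can be spotted mechanically. What follows is only a minimal sketch, assuming the usual /proc/mdstat layout of an array line followed by a "blocks" status line; the function name stuck_spares is made up for illustration.

```shell
# Print the name of every md array that is degraded (an underscore in the
# [U_] status field) and at the same time lists a spare member, "(S)".
# $1: file to parse, defaulting to /proc/mdstat.
stuck_spares() {
    awk '
        /^md[0-9]+ :/ { name = $1; spare = ($0 ~ /\(S\)/) }
        name != "" && /\[[0-9]+\/[0-9]+\]/ {
            # The last field is the status map, e.g. [U_] or [UU].
            if (spare && $NF ~ /_/) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the mdstat output quoted above, stuck_spares prints md3 and stays silent about md1.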
Versions:
# uname -a
Linux 3.0.58-0.6.6-default #1 SMP Tue Feb 19 11:07:00 UTC 2013 (1576ecd) x86_64 x86_64 x86_64 GNU/Linux
# mdadm -V
mdadm - v3.2.2 - 17th June 2011
I tried:
* removing /dev/sda3 from the array and adding it back
* removing /dev/sda3 from the array, zeroing the superblock (--zero-superblock), and adding it back
* removing /dev/sda3 from the array, reducing the raid devices to one, and adding /dev/sda3 back
* removing /dev/sda3 from the array, zeroing the first part of the disk (with dd), and adding it back
I would really appreciate ideas on how to fix this (preferably while the system is running).
Strange. I would definitely have expected one of those to start the recovery.
Does anything appear in the kernel logs (e.g. output of 'dmesg')?
What does
grep . /sys/block/md3/md/*
show?
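If the full grep output is too noisy to read, a narrower view may help. This is a rough sketch only, assuming the md sysfs attributes array_state, degraded, raid_disks, sync_action and sync_completed plus the per-device state and slot files; the function name md_summary is hypothetical and the choice of attributes is a guess at what is most relevant here.

```shell
# Print a few md sysfs attributes as "name: value" lines.
# $1: the array's md directory, defaulting to /sys/block/md3/md.
md_summary() {
    dir=${1:-/sys/block/md3/md}
    for f in "$dir"/array_state "$dir"/degraded "$dir"/raid_disks \
             "$dir"/sync_action "$dir"/sync_completed \
             "$dir"/dev-*/state "$dir"/dev-*/slot; do
        # Skip attributes that are absent (including an unmatched glob).
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${f#"$dir"/}" "$(cat "$f")"
    done
}
```

Calling md_summary with no argument reads the md3 directory; the state and slot values under dev-sda3 are probably the interesting ones for a disk that stays a spare.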
I don't suppose
echo recover > /sys/block/md3/md/sync_action
helps?
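To see whether the request took effect, the same attribute can be read back afterwards. A small sketch, assuming root privileges and that sync_action reports the action currently in progress; the name request_recover and the one-second pause are arbitrary choices for illustration.

```shell
# Ask md to start recovery, then report what it says it is doing.
# $1: path to the sync_action attribute, defaulting to md3's.
request_recover() {
    attr=${1:-/sys/block/md3/md/sync_action}
    echo recover > "$attr" || return 1
    # Give md a moment to act before reading the attribute back.
    sleep 1
    cat "$attr"
}
```

If this prints recover, a rebuild is presumably under way and /proc/mdstat should show progress; if it prints idle, the request did not start a recovery and the kernel log is the next place to look.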
Is there still a kernel thread called
md3_raid1
running?
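One way to check for the thread is to look for its exact name in the process list. A minimal sketch, assuming a procps-style ps where "ps -e -o comm=" prints one bare command name per line; the helper names thread_listed and md_thread_running are hypothetical.

```shell
# Succeed if the exact name $1 appears in the list of names on stdin.
thread_listed() {
    grep -qxF -- "$1"
}

# Succeed if a thread with the given name (default md3_raid1) is running.
md_thread_running() {
    ps -e -o comm= | thread_listed "${1:-md3_raid1}"
}
```

For example, "md_thread_running && echo present" prints present only when md3_raid1 is in the process list. Matching the whole line keeps a similar name such as md1_raid1 from producing a false positive.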
NeilBrown