Thread (5 messages), 2 authors, 2013-08-20

RE: Recovery of RAID1 fails (added disks stays as spare)

From: <hidden>
Date: 2013-08-19 07:48:20

Hey,
-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de]
Sent: Saturday, August 17, 2013 2:51 AM
To: Blaesing, Matthias (KC-IT)
Cc: linux-raid@vger.kernel.org
Subject: Re: Recovery of RAID1 fails (added disks stays as spare)

On Thu, 15 Aug 2013 09:09:40 +0000 [off-list ref] wrote:
Summary: On one of our servers we suffered a hard disk error that led
to a degraded array. The hardware was replaced and the array was
rebuilt. On one of the RAID sets the newly added disk is not activated
but stays as spare.

I tried:

* removing /dev/sda3 from the array and adding it back
* removing /dev/sda3 from the array, zeroing the superblock
  (--zero-superblock) and adding it back
* removing /dev/sda3 from the array, reducing raid devices to one, and
  adding /dev/sda3 back
* removing /dev/sda3 from the array, zeroing the first part of the disk
  (with dd) and adding it back

I would really appreciate ideas on how to fix this (preferably while
running the system).

Strange.  I would definitely have expected one of those to start the
recovery.
Does anything appear in the kernel logs (e.g. output of 'dmesg')?
Please see the attached boot.msg. From my perspective it only confirms the results
obtained from /proc/mdstat and the mdadm output.
What does
  grep . /sys/block/md3/md/*
show?
grep . /sys/block/md3/md/*
/sys/block/md3/md/array_size:default
/sys/block/md3/md/array_state:clean
grep: /sys/block/md3/md/bitmap_set_bits: Permission denied
/sys/block/md3/md/chunk_size:0
/sys/block/md3/md/component_size:970888192
/sys/block/md3/md/degraded:1
/sys/block/md3/md/layout:0
/sys/block/md3/md/level:raid1
/sys/block/md3/md/max_read_errors:20
/sys/block/md3/md/metadata_version:0.90
/sys/block/md3/md/mismatch_cnt:0
grep: /sys/block/md3/md/new_dev: Permission denied
/sys/block/md3/md/raid_disks:2
/sys/block/md3/md/reshape_direction:forwards
/sys/block/md3/md/reshape_position:none
/sys/block/md3/md/resync_start:none
/sys/block/md3/md/safe_mode_delay:0.204
/sys/block/md3/md/suspend_hi:0
/sys/block/md3/md/suspend_lo:0
/sys/block/md3/md/sync_action:idle
/sys/block/md3/md/sync_completed:none
/sys/block/md3/md/sync_force_parallel:0
/sys/block/md3/md/sync_max:max
/sys/block/md3/md/sync_min:0
/sys/block/md3/md/sync_speed:none
/sys/block/md3/md/sync_speed_max:200000 (system)
/sys/block/md3/md/sync_speed_min:1000 (system)
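
In the listing above, the pair that stands out is degraded:1 together with sync_action:idle: the array is missing a member, yet no recovery is running. As a rough sketch, the few attributes that matter here could be pulled out on their own. The function name md_summary and the choice of attributes are only illustrative assumptions; the directory defaults to the md3 path from this thread.

```shell
#!/bin/sh
# Print a one-line summary of an md array from its sysfs directory.
# md_summary is a hypothetical helper name, not part of mdadm.
md_summary() {
    md_dir=${1:-/sys/block/md3/md}
    summary=
    for attr in level raid_disks degraded sync_action; do
        # Each attribute is a small text file; unreadable ones show "?".
        value=$(cat "$md_dir/$attr" 2>/dev/null) || value='?'
        summary="${summary}${summary:+ }${attr}=${value}"
    done
    printf '%s\n' "$summary"
}

# Example call (commented out so sourcing this file has no side effects):
# md_summary /sys/block/md3/md
```

With the values shown above, this would print the level, the number of raid disks, the degraded count, and the current sync action on one line.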
I don't suppose
  echo recover > /sys/block/md3/md/sync_action
helps?
No - at least it shows no detectable reaction:

No direct output, no output in dmesg, no change in the output of mdadm or /proc/mdstat.
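
For anyone repeating this test, a small sketch that writes "recover" and immediately reads sync_action back makes the before/after state explicit. The function name request_recovery is a made-up helper; the path defaults to the md3 directory used in this thread, and writing to it needs root.

```shell
#!/bin/sh
# Ask md to start recovery and report what sync_action says afterwards.
# request_recovery is a hypothetical helper name.
request_recovery() {
    md_dir=${1:-/sys/block/md3/md}
    before=$(cat "$md_dir/sync_action") || return 1
    echo recover > "$md_dir/sync_action" || return 1
    # On real sysfs the read-back reflects the kernel's current action,
    # not simply the word that was just written.
    after=$(cat "$md_dir/sync_action") || return 1
    printf 'sync_action: %s -> %s\n' "$before" "$after"
}

# Example call (needs root on a real system):
# request_recovery /sys/block/md3/md
```

On the system described here, the read-back would be expected to still say idle, matching the "no detectable reaction" observation.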
Is there still a kernel thread called
    md3_raid1
running?
Yes:

s15434194:~ # ps ax | grep md3
  786 ?        S     44:14 [md3_raid1]
  836 pts/0    S+     0:00 grep md3
s15434194:~ #


To give some more information, here is the mdadm.conf:

s15434194:~ # cat /etc/mdadm.conf
CREATE owner=root group=disk mode=0660 auto=yes

MAILADDR root

DEVICE containers partitions
ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md3 level=raid1 num-devices=2 devices=/dev/sda3,/dev/sdb3


Could the problem lie in the order of the devices in the ARRAY line? Comparing the working md1 array with the non-working md3 array, I see a difference in how devices map to numbers:

For md1:

   Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

For md3:

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       0        0        1      removed

       2       8        3        -      spare   /dev/sda3
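
To compare the two tables without reading the columns by eye, the member rows can be reduced to "device state" pairs. This is only a sketch: list_members is a hypothetical helper that reads `mdadm --detail` output on stdin and assumes the row layout shown above (Number, Major, Minor, RaidDevice, State, optional device path).

```shell
#!/bin/sh
# Read `mdadm --detail` output on stdin and print "device state" for
# every member row, so spares and removed slots are easy to spot.
list_members() {
    awk '
        # Member rows start with Number Major Minor RaidDevice, where
        # RaidDevice is a number or "-" for a device without a slot.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $4 ~ /^([0-9]+|-)$/ {
            dev = ($NF ~ /^\//) ? $NF : "(none)"
            last = (dev == "(none)") ? NF : NF - 1
            state = ""
            for (i = 5; i <= last; i++) state = state (i > 5 ? " " : "") $i
            print dev, state
        }
    '
}

# Example call (commented out; needs a real array):
# mdadm --detail /dev/md3 | list_members
```

For the md3 table above this yields /dev/sdb3 as "active sync", an empty slot as "removed", and /dev/sda3 as "spare".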


Would the following be worth a try? (The disks need approx. 24 hours to reach the rebuilt state, so I'm reluctant to try without some hope.)

1. Change the ARRAY line in /etc/mdadm.conf for md3 to read (switching the order):
   ARRAY /dev/md3 level=raid1 num-devices=2 devices=/dev/sdb3,/dev/sda3
2. Remove the spare sda3
3. Zero the superblock of sda3
4. Add sda3 to the array again
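
Steps 2 to 4 above can be written down as a dry run before touching the array. This is a sketch only: run and readd_member are hypothetical helper names, the defaults are the device names from this thread, and nothing here says whether the sequence would actually start the recovery. The mdadm.conf edit from step 1 is not covered.

```shell
#!/bin/sh
# Print (or, with DRY_RUN=0, execute) the remove / zero / re-add steps.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"    # dry run: only show the command
    else
        "$@"
    fi
}

# Defaults are the array and member named in this thread.
readd_member() {
    array=${1:-/dev/md3}
    member=${2:-/dev/sda3}
    run mdadm "$array" --remove "$member" &&
    run mdadm --zero-superblock "$member" &&
    run mdadm "$array" --add "$member"
}

# Example call; prints the three commands unless DRY_RUN=0 is set:
# readd_member /dev/md3 /dev/sda3
```

Keeping dry run as the default means the commands can be reviewed first; the && chain stops at the first failing step when they are run for real.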

Kind regards

Matthias Bläsing
IT Department
KompetenzCenter

Tel.: 02351 950-344
Fax.: 02351 950-222
mailto: matthias.blaesing@persona.de
www.persona.de


persona service Verwaltungs AG & Co. KG 
Freisenbergstraße 31 • 58513 Lüdenscheid  
Tel.: (02351) 950-0 • Fax: (02351) 950-222 
Registered office: Lüdenscheid • Register court: Iserlohn, HRA No. 2930

General partner: persona service AG
Gartenstraße 93 • CH-4002 Basel
Commercial register Basel, No. CH-270.3.012.836-8
represented by its board of directors:
Georg Breucker (Chairman) and Dr. Sebastian Burckhardt
www.persona.de

Attachments

  • boot.msg [application/octet-stream] 64786 bytes