Thread: 2 messages, 2 authors, 2014-01-14

Re: After RAID0 grow: inconsistent superblocks and /proc/mdstat

From: Richard Michael <hidden>
Date: 2014-01-14 17:09:49

On Tue, Jan 14, 2014 at 1:11 AM, NeilBrown [off-list ref] wrote:
On Mon, 13 Jan 2014 00:19:28 -0500 Richard Michael
[off-list ref] wrote:
quoted
Neil,

Thank you for the quick reply.

I have a few followup questions and comments, inlined below.
I assume it was by mistake that you didn't copy the list on this follow-up,
and I've taken the liberty of copying the list for this reply.
Yes (typical reply vs. reply-all); thank you.
quoted
On Mon, Jan 13, 2014 at 12:03 AM, NeilBrown [off-list ref] wrote:
quoted
On Sun, 12 Jan 2014 23:37:57 -0500 Richard Michael
[off-list ref] wrote:
quoted
Hello list,

I grew a RAID0 by one disk, and it re-shaped via RAID4 as expected.

However, the component superblocks still report RAID4, while /proc/mdstat,
/sys/block/md0/md/level and "mdadm -D" all indicate RAID0.

I am reluctant to stop the array, in case auto-assemble can't put it
back together.  (I suppose I could create a new array, but I'd want to
be quite confident about the layout of the disks.)


Is this a bug?  Should/can I re-write the superblock(s)?


# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid0 sdc1[2] sdd1[0]
      5860268032 blocks super 1.2 512k chunks

# cat /sys/block/md0/md/level
raid0

# mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Fri Jan 10 13:02:25 2014
     Raid Level : raid0
     Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Jan 12 20:08:53 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       2       8       33        1      active sync   /dev/sdc1



But,


# mdadm -E /dev/sd[cd]1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f51352a:610d0ecd:a1e28ddd:86c8586c
           Name : anvil.localdomain:0  (local to host anvil.localdomain)
  Creation Time : Fri Jan 10 13:02:25 2014
     Raid Level : raid4
   Raid Devices : 3

 Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
     Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
  Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
    Data Offset : 260096 sectors
   Super Offset : 8 sectors
   Unused Space : before=260008 sectors, after=2959 sectors
          State : clean
    Device UUID : ad6e6c88:0f897bc1:1f6ec909:f599bc01

    Update Time : Sun Jan 12 20:08:53 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1388a7b - correct
         Events : 9451

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f51352a:610d0ecd:a1e28ddd:86c8586c
           Name : anvil.localdomain:0  (local to host anvil.localdomain)
  Creation Time : Fri Jan 10 13:02:25 2014
     Raid Level : raid4
   Raid Devices : 3

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
    Data Offset : 260096 sectors
   Super Offset : 8 sectors
   Unused Space : before=260008 sectors, after=2959 sectors
          State : clean
    Device UUID : b3cda274:547919b1:4e026228:0a4981e7

    Update Time : Sun Jan 12 20:08:53 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : e16a1979 - correct
         Events : 9451

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA. ('A' == active, '.' == missing, 'R' == replacing)
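
The mismatch shown above (the kernel says raid0, the superblocks say raid4) can be checked mechanically. What follows is a minimal sketch, assuming bash, awk, and the "Raid Level : ..." line format seen in the mdadm -E output in this thread. The helper names superblock_level and levels_match are made up for illustration and are not part of mdadm.

```shell
# Hypothetical helpers (not part of mdadm).

# Extract the "Raid Level" value from `mdadm -E` output read on stdin.
superblock_level() {
    awk -F' : ' '/Raid Level/ { print $2; exit }'
}

# Compare the kernel's level (argument 1) with the level recorded in
# `mdadm -E` text (stdin). Returns non-zero on a mismatch.
levels_match() {
    local kernel_level=$1 sb_level
    sb_level=$(superblock_level)
    if [ "$kernel_level" = "$sb_level" ]; then
        echo "consistent: $kernel_level"
    else
        echo "MISMATCH: kernel=$kernel_level superblock=$sb_level"
        return 1
    fi
}

# Usage on a live system (needs root):
#   mdadm -E /dev/sdc1 | levels_match "$(cat /sys/block/md0/md/level)"
```

With the values from this report, the check would print a mismatch between raid0 (kernel) and raid4 (superblock).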



Somewhat aside, I grew the array with:

"mdadm --grow /dev/md0 --raid-devices=2 --add /dev/sdc1"
That is the correct command.
quoted
I suspect I should not have used "--add".  Looking at the superblock,
there is a 3rd unknown device, which I did not intend to add.

Did I convince mdadm to add two devices at the same time, sdc1 *and* a
missing device?  (This surprises me a bit, in the sense that
--raid-devices=2 would pertain to the added devices, rather than the
total devices in the array.)

Or, perhaps mdadm added a "dummy" device as part of the temporary RAID4
conversion?
Exactly.  The RAID4 had 1 more device than the RAID0.  That is what you are
seeing.

I'm a bit confused ... did you grow this from a 1-device RAID0 to a 2-device
RAID0?  That seems like an odd thing to do, but it should certainly work.
Yes.  I'm disk/data juggling.  I will copy the data from a third 3TB
disk into the new 2-disk 6TB RAID0, then convert it to RAID5, re-using the
third disk for parity.  (Perhaps there's a method with fewer hoops to
jump through.)
Seems not-unreasonable.
quoted
quoted
This should work and I think I've tested it.  However looking at the code I
cannot see how it ever would have done.  I cannot see anything that would
write out the new metadata to the RAID0 after the reshape completes.
Normally md will never write to the metadata of a RAID0 so it would need
special handling which doesn't seem to be there.
"never write to the metadata of a RAID0":  is this why there is no
Name, UUID or Events stanza in the "mdadm -D /dev/md0" output?
No.  That's just because the level recorded in the metadata is different from
the level that md thinks the array is.  mdadm detects this inconsistency and
decides not to trust the metadata.
Might be informative to include a comment in the mdadm -D output to
that effect.  (Although in this specific case, I gather once you've
fixed the RAID0/metadata write-out, there will no longer be this
inconsistency and therefore the stanza would have been present.)
quoted
quoted
I just tried testing it on the current mainline kernel and it crashes  :-(

So it looks like I need to do some fixing here.

Your array should continue to work.  If you reboot, it will be assembled as a
RAID4 with the parity disk missing.   This will work perfectly but may not be
as fast as RAID0.  You can "mdadm --grow /dev/md0 --level=0" to convert it to
RAID0 though it probably won't cause the metadata to be updated.
How can I update the superblock?
I looked at the code some more and experimented, and if you simply stop the
array the metadata will be written out.  So after stopping the array it will
appear to be RAID0.
This worked, thank you.  (Stopped and re-assembled without problem.)
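
The stop/re-assemble step can be sketched as below. This is only an illustration under stated assumptions: the device names are the ones from this thread, any filesystem on the array is assumed to be unmounted first, and the RUN variable and refresh_metadata function are hypothetical conveniences, not mdadm features. RUN defaults to "echo", so the commands are only printed unless it is deliberately cleared.

```shell
# Sketch only. RUN defaults to "echo" so nothing is executed by accident;
# set RUN= (empty) and run as root to actually perform the steps.
RUN=${RUN-echo}

# Hypothetical wrapper: stop the array, re-assemble it from the listed
# members, then examine the member superblocks.
refresh_metadata() {
    local md=$1; shift
    $RUN mdadm --stop "$md"            # per Neil, stopping writes out the metadata
    $RUN mdadm --assemble "$md" "$@"   # re-assemble from the named members
    $RUN mdadm -E "$@"                 # superblocks should now report raid0
}

refresh_metadata /dev/md0 /dev/sdd1 /dev/sdc1
```

Naming the member devices explicitly on the --assemble line avoids relying on auto-assembly, which was the original worry about stopping the array.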
quoted
As I mentioned, the next step is to convert to RAID5.  Will the RAID4
superblock confuse the (in fact) RAID0 to RAID5 re-shape?
Shouldn't do.  But if you can stop and restart the array to get the metadata
updated, that would be safer.
Currently re-shaping to RAID5, no problems encountered.  (I notice in
the case of RAID0 to RAID5 re-shape, all the metadata has been updated
during the re-shape; mdadm -D/-E now report RAID5 with the spare
rebuilding.)
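
Progress of a reshape or rebuild like this one can be read from /proc/mdstat. The sketch below assumes the usual progress-line wording there ("reshape = 12.5% (...)", likewise for resync, recovery and check); that format is not shown anywhere in this thread, and the function name md_progress is made up for illustration.

```shell
# Hypothetical helper: print the action and percentage of any running
# resync/recovery/reshape/check found in /proc/mdstat-style text on stdin.
md_progress() {
    grep -oE '(resync|recovery|reshape|check) *= *[0-9.]+%' |
        sed -E 's/ *= */ /'
}

# Usage on a live system:
#   md_progress < /proc/mdstat
```

It prints nothing (and grep returns non-zero) when no such operation is in progress.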

Thanks again for the reply.

Regards,
Richard

quoted
quoted
Thanks for the report.
You're most welcome ; thank you!

Regards,
Richard
quoted
NeilBrown
NeilBrown