Re: Array size dropped from 40TB to 7TB when upgrading to 5.10

From: Song Liu <song@kernel.org>
Date: 2020-12-16 00:03:59

On Tue, Dec 15, 2020 at 10:40 AM Sébastien Luttringer [off-list ref] wrote:

Hello,

After a clean reboot into the new kernel 5.10.0, the size of my 40TB md raid5
array dropped to 7TB.
The previous kernel was 5.9.5. Rebooting back into 5.9.5 didn't fix the
issue.

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf[9] sdd[10] sda[7] sdb[6] sdc[11] sde[8]
      6857871360 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6]
[UUUUUU]

unused devices: <none>


# journalctl -oshort-iso --no-hostname -b -6|grep md0
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sdf operational as raid
disk 0
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sda operational as raid
disk 5
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sdd operational as raid
disk 4
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sde operational as raid
disk 2
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sdc operational as raid
disk 1
2020-12-04T02:30:47+0100 kernel: md/raid:md0: device sdb operational as raid
disk 3
2020-12-04T02:30:47+0100 kernel: md/raid:md0: raid level 5 active with 6 out of
6 devices, algorithm 2
2020-12-04T02:30:47+0100 kernel: md0: detected capacity change from 0 to
40007809105920
2020-12-04T02:31:47+0100 kernel: EXT4-fs (md0): mounted filesystem with ordered
data mode. Opts: (null)

# journalctl -oshort-iso --no-hostname -b -5|grep md0
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sdf operational as raid
disk 0
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sda operational as raid
disk 5
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sde operational as raid
disk 2
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sdd operational as raid
disk 4
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sdc operational as raid
disk 1
2020-12-15T03:53:00+0100 kernel: md/raid:md0: device sdb operational as raid
disk 3
2020-12-15T03:53:00+0100 kernel: md/raid:md0: raid level 5 active with 6 out of
6 devices, algorithm 2
2020-12-15T03:53:00+0100 kernel: md0: detected capacity change from 0 to
7022460272640
2020-12-15T03:54:20+0100 systemd-fsck[1009]: fsck.ext4: Invalid argument while
trying to open /dev/md0

There is no log of hardware errors or unclean unmounting.

# mdadm -D /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Mon Jan 24 02:53:21 2011
        Raid Level : raid5
        Array Size : 6857871360 (6540.18 GiB 7022.46 GB)
     Used Dev Size : 1371574272 (1308.04 GiB 1404.49 GB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

       Update Time : Tue Dec 15 17:53:13 2020
             State : clean
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : white:0  (local to host white)
              UUID : affd87df:da503e3b:52a8b97f:77b80c0c
            Events : 1791763

    Number   Major   Minor   RaidDevice State
       9       8       80        0      active sync   /dev/sdf
      11       8       32        1      active sync   /dev/sdc
       8       8       64        2      active sync   /dev/sde
       6       8       16        3      active sync   /dev/sdb
      10       8       48        4      active sync   /dev/sdd
       7       8        0        5      active sync   /dev/sda

The mdadm userspace has not been updated.
# mdadm -V
mdadm - v4.1 - 2018-10-01

An `mdadm --action check /dev/md0` was run without errors.

1) What's the best option to restore the size without losing the data?
2) Could this issue be related to the kernel upgrade, or is it a coincidence?

Hi,

I am very sorry for this problem. This is a bug in 5.10 which is fixed
in 5.10.1.
To fix it, please upgrade your kernel to 5.10.1 (or downgrade to a previous
version). In many cases, the array should be back to normal. If not, please try

       mdadm --grow --size <size> /dev/mdXXX

If the original array uses the full disk/partition, you can use "max" for <size>
to save some calculation.
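
If the array does not use the whole disk/partition, <size> has to be worked
out by hand. Below is a minimal sketch of that arithmetic in Python. It assumes
(as I read the mdadm man page) that --size is the per-device amount in KiB, and
that a RAID5 array's usable capacity is (devices - 1) times the per-device
size. The helper name raid5_per_device_kib is made up for illustration; the
byte counts are the "detected capacity change" values from the report above.

```python
KIB = 1024


def raid5_per_device_kib(array_bytes: int, raid_devices: int) -> int:
    """Per-device size in KiB for a RAID5 array with the given usable capacity.

    Hypothetical helper for illustration only; it is not part of mdadm.
    """
    # RAID5 spends one device's worth of space on parity.
    data_devices = raid_devices - 1
    per_device_bytes, remainder = divmod(array_bytes, data_devices)
    if remainder or per_device_bytes % KIB:
        raise ValueError("capacity is not a whole number of KiB per device")
    return per_device_bytes // KIB


# Capacity logged under 5.9.5 (the size the array should have).
expected_kib = raid5_per_device_kib(40007809105920, 6)
# Capacity logged under 5.10.0 (the shrunken size).
current_kib = raid5_per_device_kib(7022460272640, 6)

print("expected per-device size (KiB):", expected_kib)
print("current per-device size (KiB): ", current_kib)
```

The second value should match the "Used Dev Size" that mdadm -D currently
reports, which is a useful cross-check before passing the first value as <size>.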

Please let me know if you have further problems with it.

Thanks,
Song
