Re: Re: Raid1 of a slow hdd and a fast(er) SSD, howto to prioritize the SSD?

From: Zygo Blaxell <hidden>
Date: 2021-01-09 21:41:17

On Fri, Jan 08, 2021 at 08:29:45PM +0100, Andrea Gelmini wrote:

On Fri, 8 Jan 2021 at 09:36, [off-list ref] wrote:

What happens when I poison one of the drives in the mdadm array using this command? Will all data come out OK?
dd if=/dev/urandom of=/dev/sdb1 bs=1M count=100

<smiling>
Well, the same thing happens as when your laptop is stolen or you read
"open_ctree failed"... You restore from backup...
</smiling>

I have a few ideas, but it's much quicker to just try it. Let's see:

truncate -s 5G dev1
truncate -s 5G dev2
losetup /dev/loop31 dev1
losetup /dev/loop32 dev2
mdadm --create --verbose --assume-clean /dev/md0 --level=1 \
      --raid-devices=2 /dev/loop31 --write-mostly /dev/loop32

Note that with --write-mostly here, total filesystem loss is no longer
random: mdadm will always pick loop31 over loop32 while loop31 exists.
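
To confirm which member md is treating as write-mostly, the (W) flag in
/proc/mdstat can be checked. A minimal sketch; the function name
write_mostly_members and its optional file argument are made up for
illustration and are not part of mdadm:

```shell
# List members of an md array that carry the (W) write-mostly flag.
# $1: array name (e.g. md0); $2: mdstat file (defaults to /proc/mdstat).
write_mostly_members() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    # Take the status line for this array, split it into words, and keep
    # only entries such as "loop32[1](W)", printing the bare device name.
    grep "^$array :" "$mdstat" | tr ' ' '\n' |
        sed -n 's/^\([^[]*\)\[[0-9]*](W).*$/\1/p'
}
```

For an array built with the command above, `write_mostly_members md0`
should name loop32 and not loop31.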

mkfs.btrfs /dev/md0
mount -o compress=lzo /dev/md0 /mnt/sg10/
cd /mnt/sg10/
cp -af /home/gelma/dev/kernel/ .
root@glet:/mnt/sg10# dmesg -T
[Fri Jan  8 19:51:33 2021] md/raid1:md0: active with 2 out of 2 mirrors
[Fri Jan  8 19:51:33 2021] md0: detected capacity change from 0 to 5363466240
[Fri Jan  8 19:51:53 2021] BTRFS: device fsid 2fe43610-20e5-48de-873d-d1a6c2db2a6a devid 1 transid 5 /dev/md0 scanned by mkfs.btrfs (512004)
[Fri Jan  8 19:51:53 2021] md: data-check of RAID array md0
[Fri Jan  8 19:52:19 2021] md: md0: data-check done.
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): setting incompat feature flag for COMPRESS_LZO (0x8)
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): use lzo compression, level 0
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): disk space caching is enabled
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): has skinny extents
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): flagging fs with big metadata feature
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): enabling ssd optimizations
[Fri Jan  8 19:53:13 2021] BTRFS info (device md0): checking UUID tree

root@glet:/mnt/sg10# btrfs scrub start -B .
scrub done for 2fe43610-20e5-48de-873d-d1a6c2db2a6a
Scrub started:    Fri Jan  8 20:01:59 2021
Status:           finished
Duration:         0:00:04
Total to scrub:   4.99GiB
Rate:             1.23GiB/s
Error summary:    no errors found

We check the array is in sync:

root@glet:/mnt/sg10# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 loop32[1](W) loop31[0]
     5237760 blocks super 1.2 [2/2] [UU]

You have used --assume-clean and didn't tell mdadm otherwise since,
so this test didn't provide any information.

On real disks, an mdadm integrity check at this point would fail very hard,
since the devices have never been synced (unless they are both blank devices
filled with the same formatting test pattern or zeros).
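
An md consistency check can be requested through sysfs by writing "check"
to sync_action; once it finishes, mismatch_cnt holds the number of sectors
that were found to differ. A rough sketch for reading the result; the
function name md_mismatch_report is hypothetical, and the sysfs directory
is a parameter only so that it can be pointed at another array:

```shell
# Report the result of the most recent md "check" pass.
# $1: md sysfs directory (defaults to /sys/block/md0/md).
md_mismatch_report() {
    md_dir=${1:-/sys/block/md0/md}
    count=$(cat "$md_dir/mismatch_cnt") || return 2
    if [ "$count" -eq 0 ]; then
        echo "no mismatches"
    else
        echo "$count sectors differ between mirrors"
        return 1
    fi
}

# Usage on a live array (as root); wait for the check to finish first:
#   echo check > /sys/block/md0/md/sync_action
#   md_mismatch_report
```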

unused devices: <none>

Now we wipe the storage:
root@glet:/mnt/sg10# dd if=/dev/urandom of=/dev/loop32 bs=1M count=100

With --write-mostly, the above deterministically works, and

	dd if=/dev/urandom of=/dev/loop31 bs=1M count=100

deterministically damages or destroys the filesystem.

With real disk failures you don't get to pick which drive is corrupted
or when.  If it's the remote drive, you have no backup and have no way
to _know_ you have no backup.  If it's the local drive, you can recover
it if you read from the backup in time; otherwise, you lose your data
permanently on the next mdadm resync.

100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.919025 s, 114 MB/s

sync

echo 3 > /proc/sys/vm/drop_caches

I run rm to force some write I/O:

root@glet:/mnt/sg10# rm kernel/v5.11/ -rf

root@glet:/mnt/sg10# btrfs scrub start -B .
scrub done for 2fe43610-20e5-48de-873d-d1a6c2db2a6a
Scrub started:    Fri Jan  8 20:11:21 2021
Status:           finished
Duration:         0:00:03
Total to scrub:   4.77GiB
Rate:             1.54GiB/s
Error summary:    no errors found

This scrub will never detect corruption on the remote filesystem because
of --write-mostly, so you have no way to know whether it has bitrotted
away (or is just missing a whole lot of updates).
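
Since reads through md0 only ever come from loop31, the one way to see the
state of the other member in a test like this is to compare the two
backing files directly. A sketch, under the assumptions that the array is
stopped (or at least idle) and that both members use the same data offset;
the function name compare_members is made up, and the offset has to be
taken from the "Data Offset" line of `mdadm --examine` (reported in
512-byte sectors) rather than guessed:

```shell
# Compare the data areas of two mirror members, skipping the md metadata.
# $1, $2: member devices or backing files; $3: data offset in bytes.
compare_members() {
    # cmp -i skips the first $3 bytes of both inputs; -s suppresses output.
    if cmp -s -i "$3" "$1" "$2"; then
        echo "members match"
    else
        echo "members differ"
        return 1
    fi
}

# Usage, with OFFSET_BYTES computed from the mdadm --examine output:
#   compare_members dev1 dev2 "$OFFSET_BYTES"
```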

Now I stop the array and reassemble it:
mdadm -Ss

root@glet:/# mdadm --assemble /dev/md0 /dev/loop31 /dev/loop32
mdadm: /dev/md0 has been started with 2 drives.

root@glet:/# mount /dev/md0 /mnt/sg10/
root@glet:/# btrfs scrub start -B  /mnt/sg10/
scrub done for 2fe43610-20e5-48de-873d-d1a6c2db2a6a
Scrub started:    Fri Jan  8 20:15:16 2021
Status:           finished
Duration:         0:00:03
Total to scrub:   4.77GiB
Rate:             1.54GiB/s
Error summary:    no errors found

Ciao,
Gelma
