Re: [PATCH 1/2] btrfs-progs: Ignore path device during device scan
From: Anand Jain <hidden>
Date: 2021-09-30 01:02:44
On 29/09/2021 20:51, Nikolay Borisov wrote:
On 29.09.21 г. 15:44, Anand Jain wrote:
Flap means going up and down. The gist is that btrfs fi show would show the latest device being scanned, which in the case of a multipath device could be any of the paths.
But why is the problem only seen when a device flaps? Or doesn't it matter?
Because when the device re-appears it will be the last device scanned by btrfs scanning code.
It shouldn't be so fragile that we merely depend on the order of the
btrfs device enumeration. Device orders can be random. It shouldn't
matter. Even if it is working fine before the disappear-reappear cycle,
it is just by luck.
To show the list of btrfs devices that are unmounted, we use
btrfs_scan_devices(). In btrfs_scan_devices(), we extensively use
libblkid to enumerate btrfs devices. A few important lines from it are
shown below.
-----------------
int btrfs_scan_devices(int verbose)
<snip>
	ret = blkid_get_cache(&cache, NULL);
	if (ret < 0) {
		errno = -ret;
		error("blkid cache get failed: %m");
		return ret;
	}
	blkid_probe_all(cache);
	iter = blkid_dev_iterate_begin(cache);
	blkid_dev_set_search(iter, "TYPE", "btrfs");
	while (blkid_dev_next(iter, &dev) == 0) {
		dev = blkid_verify(cache, dev);
		if (!dev)
			continue;
		/* if we are here its definitely a btrfs disk*/
		strncpy_null(path, blkid_dev_devname(dev));
-----------------
So, you mean to say that blkid_dev_set_search() always finds all three
paths containing the same fsid+uuid+devid, and only their order varies
when an underlying device disappears and reappears?
For example:
Device order before /dev/sdb disappeared
/dev/sda MAJ:MIN??
/dev/sdb MAJ:MIN??
/dev/mapper/3600140501cc1f49e5364f0093869c763 MAJ:MIN??
Device order after /dev/sdb reappeared
/dev/sda MAJ:MIN??
/dev/mapper/3600140501cc1f49e5364f0093869c763 MAJ:MIN??
/dev/sdb MAJ:MIN??
Could you please help find the MAJ:MIN of the devices before and
after the disappear-reappear cycle? Are we sure the reappeared device
has the same MAJ:MIN as before? Is it shown as a new device? If not,
then it could be a block layer problem too.
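In case it helps with collecting those numbers, here is a minimal sketch of how MAJ:MIN could be read for a given path using stat(2) and the glibc major()/minor() macros. The helper names format_maj_min() and path_maj_min() are made up for illustration and are not part of btrfs-progs. Running `lsblk` on the box would give the same information without any code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>	/* major(), minor(), makedev() on Linux/glibc */
#include <sys/types.h>

/* Format a device number as "MAJ:MIN"; returns what snprintf() returns. */
static int format_maj_min(dev_t rdev, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%u:%u", major(rdev), minor(rdev));
}

/*
 * Hypothetical helper: look up MAJ:MIN for a block device path.
 * stat() follows symlinks, so a /dev/mapper/... link resolves to the
 * underlying dm node. Returns 0 on success, -1 if stat() fails or the
 * path is not a block device.
 */
static int path_maj_min(const char *path, char *buf, size_t len)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISBLK(st.st_mode))
		return -1;
	format_maj_min(st.st_rdev, buf, len);
	return 0;
}
```

Calling path_maj_min() on each of the three paths before and after the disappear-reappear cycle would show whether the reappeared path kept its device number.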
So, as said, we shouldn't depend on the order of the device enumeration.
Your fix makes sense to an extent, but it still depends on the order in
which devices are reported. Say we dd one device onto another; then
btrfs fi show will show whichever device blkid enumerated last
(I think). If so, that is wrong.
Could we make it order neutral? I don't know how. I think the only
choice is to list all the devices in a tree format. Similar to
lsblk(8).
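To make the tree idea a bit more concrete, here is a rough sketch, not a proposal for the actual btrfs-progs code. It assumes a made-up record type (struct scanned_path) holding fsid, devid and path for every path the scan found, sorts the records, and prints every path under its fsid/devid group. Because of the sort, the output is the same whatever order blkid reported the paths in.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one scanned path; not a btrfs-progs structure. */
struct scanned_path {
	const char *fsid;		/* filesystem UUID as text */
	unsigned long long devid;	/* btrfs devid within the filesystem */
	const char *path;		/* e.g. /dev/sda or /dev/mapper/... */
};

/* Total order on (fsid, devid, path), independent of scan order. */
static int cmp_scanned(const void *a, const void *b)
{
	const struct scanned_path *x = a, *y = b;
	int c = strcmp(x->fsid, y->fsid);

	if (c)
		return c;
	if (x->devid != y->devid)
		return x->devid < y->devid ? -1 : 1;
	return strcmp(x->path, y->path);
}

/*
 * Sort v in place and write one header line per fsid/devid group,
 * followed by every path in that group. Returns the number of bytes
 * written; stops early (keeping out NUL-terminated) if out is too small.
 */
static size_t format_tree(struct scanned_path *v, size_t n,
			  char *out, size_t len)
{
	size_t off = 0;

	assert(out != NULL && len > 0);
	out[0] = '\0';
	qsort(v, n, sizeof(*v), cmp_scanned);

	for (size_t i = 0; i < n; i++) {
		int w;

		if (i == 0 || v[i].devid != v[i - 1].devid ||
		    strcmp(v[i].fsid, v[i - 1].fsid) != 0) {
			w = snprintf(out + off, len - off,
				     "fsid %s devid %llu\n",
				     v[i].fsid, v[i].devid);
			if (w < 0 || (size_t)w >= len - off) {
				out[off] = '\0';
				return off;
			}
			off += (size_t)w;
		}
		w = snprintf(out + off, len - off, "  %s\n", v[i].path);
		if (w < 0 || (size_t)w >= len - off) {
			out[off] = '\0';
			return off;
		}
		off += (size_t)w;
	}
	return off;
}
```

Feeding it the "before" and "after" orders from the example above gives the same listing both times, with sda, sdb and the mapper device all shown under one devid, so nothing depends on which path happened to be scanned last.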
Thanks, Anand
It can be reproduced by following steps:
1) Validate 'btrfs fi show' is showing /dev/mapper/xxxx for all fs's
1) Unplug one of the cables from the FC adapter <this can be simulated by simply doing 'echo 1 > /sys/block/sdd/device/delete' for the given path device>
2) Wait for the paths/drives associated with the downed port to disappear
3) Check again that 'btrfs fi show' still lists the /dev/mapper entry
4) Reattach the cable to the hba port <this can be simulated by rescanning the HBA or w/e bus you have: echo "- - -" > /sys/class/scsi_host/host1/scan>
5) Check that 'btrfs fi show' is now shows /dev/sdX devices for all mpio fs's
Do you mean 'btrfs fi show' shows a path device of the multipath, whereas 'btrfs fi show -m' shows the correct multipath device?
Yes, that's a problem purely in btrfs-progs, as the path devices are opened exclusively for the purpose of multipath.
OK. All parts of the test case are with an unmounted btrfs; that is clear to me now.