Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed
From: Jan Kara <jack@suse.cz>
Date: 2023-03-23 10:52:56
Also in:
linux-block, lkml
On Thu 23-03-23 00:08:51, Ming Lei wrote:
On Wed, Mar 22, 2023 at 02:07:09PM +0100, Jan Kara wrote:quoted
On Wed 22-03-23 19:34:30, Ming Lei wrote:quoted
On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:quoted
On Wed 22-03-23 15:58:35, Ming Lei wrote:quoted
On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:quoted
From: Yu Kuai <redacted> Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still set, and partition scan will be proceed again when blkdev_get_by_dev() is called. However, this will cause a problem that re-assemble partitioned raid device will creat partition for underlying disk. Test procedure: mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0 sgdisk -n 0:0:+100MiB /dev/md0 blockdev --rereadpt /dev/sda blockdev --rereadpt /dev/sdb mdadm -S /dev/md0 mdadm -A /dev/md0 /dev/sda /dev/sdb Test result: underlying disk partition and raid partition can be observed at the same time Note that this can still happen in come corner cases that GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid device. Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again") Signed-off-by: Yu Kuai <redacted>The issue still can't be avoided completely, such as, after rebooting, /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one should be underlying partitions scanned before re-assembling raid, I guess it may not be easy to avoid.So this was always happening (before my patches, after my patches, and now after Yu's patches) and kernel does not have enough information to know that sda will become part of md0 device in the future. But mdadm actually deals with this as far as I remember and deletes partitions for all devices it is assembling the array from (and quick tracing experiment I did supports this).I am testing on Fedora 37, so mdadm v4.2 doesn't delete underlying partitions before re-assemble.Strange, I'm on openSUSE Leap 15.4 and mdadm v4.1 deletes these partitions (at least I can see mdadm do BLKPG_DEL_PARTITION ioctls). And checking mdadm sources I can see calls to remove_partitions() from start_array() function in Assemble.c so I'm not sure why this is not working for you...I added dump_stack() in delete_partition() for partition 1, not observe stack trace during booting.quoted
quoted
Also given mdadm or related userspace has to change for avoiding to scan underlying partitions, just wondering why not let userspace to tell kernel not do it explicitly?Well, those userspace changes are long deployed, now you would introduce new API that needs to proliferate again. Not very nice. Also how would that exactly work? I mean once mdadm has underlying device open, the current logic makes sure we do not create partitions anymore. But there's no way how mdadm could possibly prevent creation of partitions for devices it doesn't know about yet so it still has to delete existing partitions...I meant if mdadm has to change to delete existed partitions, why not add one ioctl to disable partition scan for this disk when deleting partitions/re-assemble, and re-enable scan after stopping array. But looks it isn't so, since you mentioned that remove_partitions is supposed to be called before starting array, however I didn't observe this behavior.
Yeah, not sure what's happening on your system.
I am worrying if the current approach may cause regression, one concern is that ioctl(BLKRRPART) needs exclusive open now, such as: 1) mount /dev/vdb1 /mnt 2) ioctl(BLKRRPART) may fail after removing /dev/vdb3
Well, but we always had some variant of:
if (disk->open_partitions)
return -EBUSY;
in disk_scan_partitions(). So as long as any partition on the disk is used,
EBUSY is the correct return value from BLKRRPART.
Honza
--
Jan Kara [off-list ref]
SUSE Labs, CR