Re: [PATCH] block: fix deadlock between bd_link_disk_holder and partition scan
From: Song Liu <song@kernel.org>
Date: 2024-02-09 00:49:27
Also in:
linux-block, lkml
On Thu, Feb 8, 2024 at 12:44 AM Li Nan [off-list ref] wrote:
在 2024/2/8 14:50, Song Liu 写道:quoted
On Wed, Feb 7, 2024 at 1:32 AM [off-list ref] wrote:quoted
From: Li Nan <redacted> 'open_mutex' of gendisk is used to protect open/close block devices. But in bd_link_disk_holder(), it is used to protect the creation of symlink between holding disk and slave bdev, which introduces some issues. When bd_link_disk_holder() is called, the driver is usually in the process of initialization/modification and may suspend submitting io. At this time, any io hold 'open_mutex', such as scanning partitions, can cause deadlocks. For example, in raid: T1 T2 bdev_open_by_dev lock open_mutex [1] ... efi_partition ... md_submit_bio md_ioctl mddev_syspend -> suspend all io md_add_new_disk bind_rdev_to_array bd_link_disk_holder try lock open_mutex [2] md_handle_request -> wait mddev_resume T1 scan partition, T2 add a new device to raid. T1 waits for T2 to resume mddev, but T2 waits for open_mutex held by T1. Deadlock occurs. Fix it by introducing a local mutex 'holder_mutex' to replace 'open_mutex'.Is this to fix [1]? Do we need some Fixes and/or Closes tags?No. Just use another way to fix [2], and both [2] and this patch can fix the issue. I am not sure about the root cause of [1] yet. [2] https://patchwork.kernel.org/project/linux-raid/list/?series=812045quoted
Could you please add steps to reproduce this issue?We need to modify the kernel, add sleep in md_submit_bio() and md_ioctl() as below, and then: 1. mdadm -CR /dev/md0 -l1 -n2 /dev/sd[bc] #create a raid 2. echo 1 > /sys/module/md_mod/parameters/error_inject #enable sleep 3. 'mdadm --add /dev/md0 /dev/sda' #add a disk to raid 4. submit ioctl BLKRRPART to raid within 10s.
The analysis makes sense. I also hit the issue a couple times without adding extra delays. But I am not sure whether this is the best fix (I didn't find real issues with it either). Maybe we don't need to suspend the array for ADD_NEW_DISK? So that something like the following might just work? Thanks, Song
@@ -7573,7 +7577,6 @@ static inline bool md_ioctl_valid(unsigned int cmd) static bool md_ioctl_need_suspend(unsigned int cmd) { switch (cmd) { - case ADD_NEW_DISK: case HOT_ADD_DISK: case HOT_REMOVE_DISK: case SET_BITMAP_FILE: