Re: RAID1 removing failed disk returns EBUSY
From: Xiao Ni <hidden>
Date: 2015-01-20 07:16:46
----- Original Message -----
> From: "Joe Lawrence" <redacted>
> To: "Xiao Ni" <redacted>
> Cc: "NeilBrown" <redacted>, linux-raid@vger.kernel.org, "Bill Kuzeja" <redacted>
> Sent: Tuesday, January 20, 2015 1:56:50 AM
> Subject: Re: RAID1 removing failed disk returns EBUSY
>
> On Sun, 18 Jan 2015 21:33:50 -0500
> Xiao Ni [off-list ref] wrote:
>
> > ----- Original Message -----
> > > From: "Joe Lawrence" <redacted>
> > > To: "Xiao Ni" <redacted>
> > > Cc: "NeilBrown" <redacted>, linux-raid@vger.kernel.org, "Bill Kuzeja" [off-list ref]
> > > Sent: Friday, January 16, 2015 11:10:31 PM
> > > Subject: Re: RAID1 removing failed disk returns EBUSY
> > >
> > > On Fri, 16 Jan 2015 00:20:12 -0500
> > > Xiao Ni [off-list ref] wrote:
> > >
> > > > Hi Joe
> > > >
> > > > Thanks for reminding me. I didn't do that. Now it can be removed
> > > > successfully after writing "idle" to sync_action.
> > > >
> > > > I wrongly thought that the patch referenced in this mail fixed
> > > > the problem.
> > >
> > > So it sounds like even with 3.18 and a new mdadm, this bug still
> > > persists?
> > >
> > > -- Joe
> >
> > Hi Joe
> >
> > I'm a little confused now. Does the patch
> > 45eaf45dfa4850df16bc2e8e7903d89021137f40 from linux-stable resolve
> > the problem?
> >
> > My environment is:
> >
> > [root@dhcp-12-133 mdadm]# mdadm --version
> > mdadm - v3.3.2-18-g93d3bd3 - 18th December 2014
> > (this is the newest upstream)
> > [root@dhcp-12-133 mdadm]# uname -r
> > 3.18.2
> >
> > My steps are:
> >
> > [root@dhcp-12-133 mdadm]# lsblk
> > sdb       8:16   0 931.5G  0 disk
> > └─sdb1    8:17   0     5G  0 part
> > sdc       8:32   0 186.3G  0 disk
> > sdd       8:48   0 931.5G  0 disk
> > └─sdd1    8:49   0     5G  0 part
> >
> > [root@dhcp-12-133 mdadm]# mdadm -CR /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdd1 --assume-clean
> > mdadm: Note: this array has metadata at the start and
> >     may not be suitable as a boot device.  If you plan to
> >     store '/boot' on this device please ensure that
> >     your boot-loader understands md/v1.x metadata, or use
> >     --metadata=0.90
> > mdadm: Defaulting to version 1.2 metadata
> > mdadm: array /dev/md0 started.
> >
> > Then I unplug the disk.
> >
> > [root@dhcp-12-133 mdadm]# lsblk
> > sdc       8:32   0 186.3G  0 disk
> > sdd       8:48   0 931.5G  0 disk
> > └─sdd1    8:49   0     5G  0 part
> >   └─md0   9:0    0     5G  0 raid1
> >
> > [root@dhcp-12-133 mdadm]# echo faulty > /sys/block/md0/md/dev-sdb1/state
> > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > -bash: echo: write error: Device or resource busy
> > [root@dhcp-12-133 mdadm]# echo idle > /sys/block/md0/md/sync_action
> > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> >
> > Now, after I write "idle" to sync_action, the disk can be removed,
> > as you said in the mail. It's a good workaround. Is this OK?
> >
> > Best Regards
> > Xiao
>
> Hi Xiao,
>
> According to my notes, the "idle" sync_action was always a viable
> workaround, with or without this change. Neil's patch should have made
> it possible to issue only a "faulty" and "remove" to remove the RAID
> component.
>
> I don't have an exact version, but it appears that my mdadm version was
> an upstream git from the Oct 27th timeframe.
>
> -- Joe
Joe

Thanks for the explanation. So echoing "idle" to sync_action is a workaround even without the patch. It looks like the patch alone is not enough to fix the problem.

Have you tried the new patch? Does the problem still exist in your environment? If your environment has no problem, can you give me the version number? I'll try the same version too.

Best Regards
Xiao
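For reference, the sequence discussed in this thread can be written as a small shell function. This is only a sketch, not something taken from Neil's patch or from mdadm: remove_member and SYSFS_ROOT are hypothetical names made up for this example, while the sysfs attributes (dev-<member>/state and sync_action) are the ones used in the commands quoted above. It assumes that the first "remove" fails with a write error while the array is busy, as in the transcript.

```shell
#!/bin/sh
# Sketch: fail and remove an md member through sysfs, falling back to the
# "idle" sync_action workaround if the first remove is rejected (e.g. EBUSY).
# remove_member and SYSFS_ROOT are hypothetical names for this example.

SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"

remove_member() {
    array="$1"      # array name, e.g. md0
    member="$2"     # member name, e.g. sdb1
    md_dir="$SYSFS_ROOT/$array/md"
    state="$md_dir/dev-$member/state"

    if [ ! -e "$state" ]; then
        echo "no such member: $state" >&2
        return 1
    fi

    # Mark the member faulty, then try to remove it directly.
    echo faulty > "$state" || return 1
    if echo remove > "$state" 2>/dev/null; then
        return 0
    fi

    # Workaround from the thread: set sync_action to idle, then retry.
    echo idle > "$md_dir/sync_action" || return 1
    echo remove > "$state"
}
```

Run as root, for example `remove_member md0 sdb1`. The function returns non-zero if the member still cannot be removed after sync_action has been set to idle.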