Re: RAID1 removing failed disk returns EBUSY
From: Joe Lawrence <hidden>
Date: 2015-01-23 15:11:29
On Tue, 20 Jan 2015 02:16:46 -0500 Xiao Ni [off-list ref] wrote:
> Joe
>
> Thanks for the explanation. So echoing "idle" to sync_action is a
> workaround without the patch. It looks like the patch is not enough
> to fix the problem. Have you tried the new patch? Does the problem
> still exist in your environment? If your environment has no problem,
> can you give me the version number? I'll try the same version too.

Hi Xiao,
Bill and I did some more testing yesterday and I think we've figured
out the confusion. Running a 3.18+ kernel and an upstream mdadm, it
was the udev invocation of "mdadm -If <dev>" that was automatically
removing the device for us.
If we ran with an older mdadm and got the MD wedged in the faulty
condition, then nothing we echoed into the sysfs state file ('idle',
'fail' or 'remove') would change anything. I think this agrees with
your testing report.
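
For anyone reproducing the sysfs writes described above, here is a
minimal sketch of the sequence. The helper name md_write, the
SYSFS_ROOT override, and the md0 / sdb1 names are placeholders of
mine, not anything from the patch. As far as I know, the kernel md
documentation spells the failure keyword for the per-device state
file as "faulty", so that is what the commented example uses.

```shell
#!/bin/sh
# Hypothetical helper: write a keyword to an md sysfs attribute and
# report whether the write was accepted (a wedged array is expected
# to reject it, e.g. with EBUSY).
# SYSFS_ROOT defaults to /sys; override it to try this on a scratch dir.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

md_write() {
    # $1 = array (e.g. md0), $2 = attribute under md/, $3 = keyword
    attr="$SYSFS_ROOT/block/$1/md/$2"
    if printf '%s\n' "$3" > "$attr"; then
        echo "ok: $3 -> $attr"
    else
        echo "failed: $3 -> $attr"
        return 1
    fi
}

# Example calls (run as root; array and member names are placeholders):
# md_write md0 sync_action idle
# md_write md0 dev-sdb1/state faulty
# md_write md0 dev-sdb1/state remove
```

On a wedged array I would expect each call to print "failed", which
matches what we saw with the older mdadm.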
So two things:
1 - Did you make / make install the latest mdadm and see it try to run
mdadm -If on the removed disk? (You could also try manually running
it.)
2 - I think the sysfs interface to the removed disks is still broken in
cases where (1) doesn't occur.
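
To try (1) by hand, something like the sketch below should mimic what
the udev rule does on device removal. The wrapper name
incremental_fail, the DRY_RUN switch, and the sdb1 device name are
placeholders of mine; only the "mdadm -If <dev>" call comes from the
discussion above.

```shell
#!/bin/sh
# Sketch: run the same command udev runs when a member disk disappears.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 and run
# as root to actually execute it.
DRY_RUN="${DRY_RUN:-1}"

incremental_fail() {
    # $1 = kernel name of the removed member, e.g. sdb1 (placeholder)
    if [ "$DRY_RUN" = "1" ]; then
        echo "mdadm -If $1"
    else
        mdadm -If "$1"
    fi
}

# Example (placeholder device name):
# incremental_fail sdb1
# cat /proc/mdstat    # check whether the member has left the array
```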
Thanks,
-- Joe