Re: [PATCH] md-cluster: fix use-after-free issue when removing rdev
From: Paul Menzel <hidden>
Date: 2021-04-08 06:33:56
Dear Heming,

On 08.04.21 at 07:52, heming.zhao@suse.com wrote:
> On 4/8/21 1:09 PM, Paul Menzel wrote:
>> On 08.04.21 at 05:01, Heming Zhao wrote:
>>> md_kick_rdev_from_array will remove rdev, so we should
>>> use rdev_for_each_safe to search list.
>>>
>>> How to trigger:
>>>
>>> for i in {1..20}; do
>>>     echo ==== $i `date` ====;
>>>
>>>     mdadm -Ss && ssh ${node2} "mdadm -Ss"
>>>     wipefs -a /dev/sda /dev/sdb
>>>
>>>     mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \
>>>        /dev/sdb --assume-clean
>>>     ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
>>>     mdadm --wait /dev/md0
>>>     ssh ${node2} "mdadm --wait /dev/md0"
>>>
>>>     mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
>>>     sleep 1
>>> done
>>
>> In the test script, I do not understand, what node2 is used for,
>> where you log in over SSH.
>
> The bug can only be triggered in cluster env. There are two nodes
> (in cluster), To run this script on node1, and need ssh to node2 to
> execute some cmds. ${node2} stands for node2 ip address.
> e.g.: ssh 192.168.0.3 "mdadm --wait ..."

Please excuse my ignorance. I guess some other component is needed to connect the two RAID devices on each node? At least you never tell mdadm directly to use *node2*. According to *Cluster Multi-device (Cluster MD)* [1], a resource agent is needed.

>>> ... ...
>>> Signed-off-by: Heming Zhao <redacted>
>>> Reviewed-by: Gang He <redacted>
>>
>> If there is a commit, your patch is fixing, please add a Fixes: tag.
>
> OK, I forgot it, will send v2 patch later.

Awesome.


Kind regards,

Paul


[1]: https://documentation.suse.com/sle-ha/12-SP4/html/SLE-HA-all/cha-ha-cluster-md.html