Re: [PATCH 2/7] md: fix a race condition when stopping the sync thread
From: Mikulas Patocka <mpatocka@redhat.com>
Date: 2024-01-18 13:28:09
Also in:
dm-devel
On Thu, 18 Jan 2024, Yu Kuai wrote:
Hi, 在 2024/01/18 21:07, Mikulas Patocka 写道:quoted
On Thu, 18 Jan 2024, Yu Kuai wrote:quoted
Hi, 在 2024/01/18 2:18, Mikulas Patocka 写道:quoted
Note that md_wakeup_thread_directly is racy - it will do nothing if the thread is already running or it may cause spurious wake-up if the thread is blocked in another subsystem.No, as the comment said, md_wakeup_thread_directly() is just to prevent that md_wakeup_thread() can't wake up md_do_sync() if it's waiting for metadata update.Yes - but what happens if you wake up the thread just a few instructions before it is going to sleep for metadata update? wake_up_process does nothing on a running process and the thread proceeds with waiting. This is what I thought could happen when I was making the patch.Please notice that in the orginal code md_wakeup_thread_directly() is used for sync_thread, and md_wakeup_thread() should be used for *mddev->thread* (mddev_unlock always do that) to clear MD_RECOVERY_RUNNING. By the way, the root cause that MD_RECOVERY_RUNNING is not cleared is that mddev_suspend() never stop sync_thread at all, while md_check_recovery() won't do anything when mddev is suspended. Before: 1. suspend 2. call md_reap_sync_thread() directly to unregister sync_thread -> notice that this is not safe. 3. resume Now: 1. suspend 2. call stop_sync_thread() to unregister sync_thread interrupt md_do_sync() and wait for md_check_recovery() to clear MD_RECOVERY_RUNNING. -> which will never happen now; 3. resume I fixed this locally and the test integrity-caching.sh passed in my VM. Thanks, Kuai
OK, Thanks. Mikulas