Re: [PATCH -next v2 6/6] ext4: fix possible store wrong check interval value in disk when umount
From: yebin <hidden>
Date: 2021-10-08 03:49:17
Also in:
lkml
On 2021/10/7 21:12, Jan Kara wrote:
On Sat 11-09-21 17:00:59, Ye Bin wrote:quoted
Test follow steps: 1. mkfs.ext4 /dev/sda -O mmp 2. mount /dev/sda /mnt 3. wait for about 1 minute 4. umount mnt 5. debugfs /dev/sda 6. dump_mmp 7. fsck.ext4 /dev/sda I found 'check_interval' is range in [5, 10]. And sometime run fsck print "MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...". kmmpd: ... if (diff < mmp_update_interval * HZ) schedule_timeout_interruptible(mmp_update_interval * HZ - diff); diff = jiffies - last_update_time; ... mmp_check_interval = max(min(EXT4_MMP_CHECK_MULT * diff / HZ, EXT4_MMP_MAX_CHECK_INTERVAL), EXT4_MMP_MIN_CHECK_INTERVAL); mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval); ... We will call ext4_stop_mmpd to stop kmmpd kthread when umount, and schedule_timeout_interruptible will be interrupted, so 'diff' maybe little than mmp_update_interval. Then mmp_check_interval will range in [EXT4_MMP_MAX_CHECK_INTERVAL, EXT4_MMP_CHECK_MULT * diff / HZ]. To solve this issue, if 'diff' little then mmp_update_interval * HZ just break loop, don't update check interval. Signed-off-by: Ye Bin <redacted> --- fs/ext4/mmp.c | 8 ++++++++ 1 file changed, 8 insertions(+)diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c index a0d47a906faa..f39e1fa0c6db 100644 --- a/fs/ext4/mmp.c +++ b/fs/ext4/mmp.c@@ -205,6 +205,14 @@ static int kmmpd(void *data) schedule_timeout_interruptible(mmp_update_interval * HZ - diff); diff = jiffies - last_update_time; + /* If 'diff' little 'than mmp_update_interval * HZ', it + * means someone call ext4_stop_mmpd to stop kmmpd + * kthread. We don't need to update mmp_check_interval + * any more, as 'diff' is not exact value. + */ + if (unlikely(diff < mmp_update_interval * HZ && + kthread_should_stop())) + break; }So in this case, mmp_check_interval would be EXT4_MMP_MIN_CHECK_INTERVAL. I don't quite understand what the practical problem is - the fsck message? That will happen anytime mmp_check_interval is >= 10 AFAICT and I don't quite see how that is connected to this condition... Can you explain a bit more please? Honza
I just think 'mmp_check_interval' is not reflect real check interval, and also sometime run fsck print "MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...", but sometime not.