Re: [PATCH] generic/335: explicitly fsync file foo when running on btrfs
From: Filipe Manana <fdmanana@kernel.org>
Date: 2021-12-10 10:51:13
Also in:
linux-btrfs
On Fri, Dec 10, 2021 at 12:32 AM Dave Chinner [off-list ref] wrote:
On Thu, Dec 09, 2021 at 02:44:06PM +0000, fdmanana@kernel.org wrote:quoted
From: Filipe Manana <redacted> The test is relying on the fact that an fsync on directory "a" will result in persisting the changes of its subdirectory "b", namely the rename of "/a/b/foo" to "/c/foo". For this particular filesystem layout, that will happen on btrfs, because all the directory entries end up in the same metadata leaf. However that is not a behaviour we can always guarantee on btrfs. For example, if we add more files to directory "a" before and after creating subdirectory "b", like this: mkdir $SCRATCH_MNT/a for ((i = 0; i < 1000; i++)); do echo -n > $SCRATCH_MNT/a/file_$i done mkdir $SCRATCH_MNT/a/b for ((i = 1000; i < 2000; i++)); do echo -n > $SCRATCH_MNT/a/file_$i done mkdir $SCRATCH_MNT/c touch $SCRATCH_MNT/a/b/foo sync # The rest of the test remains unchanged... (...) Then after fsyncing only directory "a", the rename of file "foo" from "/a/b/foo" to "/c/foo" is not persisted. Guaranteeing that on btrfs would be expensive on large directories, as it would require scanning for every subdirectory. It's also not required by posix for the fsync on a directory to persist changes inside its subdirectories. So add an explicit fsync on file "foo" when the filesystem being tested is btrfs. Signed-off-by: Filipe Manana <redacted> --- tests/generic/335 | 9 +++++++++ 1 file changed, 9 insertions(+)diff --git a/tests/generic/335 b/tests/generic/335 index e04f7a5f..196ada64 100755 --- a/tests/generic/335 +++ b/tests/generic/335@@ -51,6 +51,15 @@ mv $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/c/ touch $SCRATCH_MNT/a/bar $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a +# btrfs can not guarantee that when we fsync a directory all its subdirectories +# created on past transactions are fsynced as well. It may do it sometimes, but +# it's not guaranteed, giving such guarantees would be too expensive for large +# directories and posix does not require that recursive behaviour. So if we want +# the rename of "foo" to be persisted, explicitly fsync "foo". +if [ $FSTYP == "btrfs" ]; then + $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/c/foo +fiDoesn't this imply a regression in the behaviour of btrfs? After all, this test was written explicitly to exercise a bug in btrfs log recovery: commit 72da52702a4b5bea1170aed791893487d0748566 Author: Filipe Manana [off-list ref] Date: Fri Feb 19 12:15:17 2016 +1100 generic: test directory fsync after rename operation Test that if we move one file between directories, fsync the parent directory of the old directory, power fail and remount the filesystem, the file is not lost and it's located at the destination directory. This is motivated by a bug found in btrfs, which is fixed by the patch (for the linux kernel) titled: "Btrfs: fix file loss on log replay after renaming a file and fsync" Tested against ext3, ext4, xfs, f2fs and reiserfs. Signed-off-by: Filipe Manana [off-list ref] Reviewed-by: Dave Chinner [off-list ref] Signed-off-by: Dave Chinner [off-list ref] What's changed that now makes this an invalid test for btrfs log recovery? Has btrfs loosened it's metadata ordering guarantees in the past few years?
No, btrfs has not loosened its metadata ordering guarantees. The test was written because btrfs had a bug where it would lose file foo if we fsynced only directory "/a". I should have made the test check that after log replay, file foo exists either at "/c/foo" or at "/a/b/foo" - that the file was not lost. Perhaps that's a better change than making the test fsync file foo explicitly. As I mentioned in the changelog, in this particular case the rename is persisted, because all the metadata is very small and fits in a single leaf of metadata. But, for example, adding those 1000 files after creating "/a" and another 1000 files after creating "/a/b", makes the test fail because it expects the rename to have been persisted, that file foo is at "/c/foo" - in that case the file is still at "/a/b/foo". Thanks.
Cheers, Dave. -- Dave Chinner david@fromorbit.com