Thread (3 messages) 3 messages, 2 authors, 2021-12-10

Re: [PATCH] generic/335: explicitly fsync file foo when running on btrfs

From: Filipe Manana <fdmanana@kernel.org>
Date: 2021-12-10 10:51:13
Also in: linux-btrfs

On Fri, Dec 10, 2021 at 12:32 AM Dave Chinner [off-list ref] wrote:
On Thu, Dec 09, 2021 at 02:44:06PM +0000, fdmanana@kernel.org wrote:
quoted
From: Filipe Manana <redacted>

The test is relying on the fact that an fsync on directory "a" will
result in persisting the changes of its subdirectory "b", namely the
rename of "/a/b/foo" to "/c/foo". For this particular filesystem layout,
that will happen on btrfs, because all the directory entries end up in
the same metadata leaf. However that is not a behaviour we can always
guarantee on btrfs. For example, if we add more files to directory
"a" before and after creating subdirectory "b", like this:

  mkdir $SCRATCH_MNT/a
  for ((i = 0; i < 1000; i++)); do
      echo -n > $SCRATCH_MNT/a/file_$i
  done
  mkdir $SCRATCH_MNT/a/b
  for ((i = 1000; i < 2000; i++)); do
      echo -n > $SCRATCH_MNT/a/file_$i
  done
  mkdir $SCRATCH_MNT/c
  touch $SCRATCH_MNT/a/b/foo

  sync

  # The rest of the test remains unchanged...
  (...)

Then after fsyncing only directory "a", the rename of file "foo" from
"/a/b/foo" to "/c/foo" is not persisted.

Guaranteeing that on btrfs would be expensive on large directories, as
it would require scanning for every subdirectory. It's also not required
by posix for the fsync on a directory to persist changes inside its
subdirectories. So add an explicit fsync on file "foo" when the filesystem
being tested is btrfs.

Signed-off-by: Filipe Manana <redacted>
---
 tests/generic/335 | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/tests/generic/335 b/tests/generic/335
index e04f7a5f..196ada64 100755
--- a/tests/generic/335
+++ b/tests/generic/335
@@ -51,6 +51,15 @@ mv $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/c/
 touch $SCRATCH_MNT/a/bar
 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a

+# btrfs can not guarantee that when we fsync a directory all its subdirectories
+# created on past transactions are fsynced as well. It may do it sometimes, but
+# it's not guaranteed, giving such guarantees would be too expensive for large
+# directories and posix does not require that recursive behaviour. So if we want
+# the rename of "foo" to be persisted, explicitly fsync "foo".
+if [ $FSTYP == "btrfs" ]; then
+     $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/c/foo
+fi
Doesn't this imply a regression in the behaviour of btrfs? After
all, this test was written explicitly to exercise a bug in btrfs log
recovery:

commit 72da52702a4b5bea1170aed791893487d0748566
Author: Filipe Manana [off-list ref]
Date:   Fri Feb 19 12:15:17 2016 +1100

    generic: test directory fsync after rename operation

    Test that if we move one file between directories, fsync the parent
    directory of the old directory, power fail and remount the filesystem,
    the file is not lost and it's located at the destination directory.

    This is motivated by a bug found in btrfs, which is fixed by the patch
    (for the linux kernel) titled:

      "Btrfs: fix file loss on log replay after renaming a file and fsync"

    Tested against ext3, ext4, xfs, f2fs and reiserfs.

    Signed-off-by: Filipe Manana [off-list ref]
    Reviewed-by: Dave Chinner [off-list ref]
    Signed-off-by: Dave Chinner [off-list ref]

What's changed that now makes this an invalid test for btrfs log
recovery? Has btrfs loosened it's metadata ordering guarantees in
the past few years?
No, btrfs has not loosened its metadata ordering guarantees.
The test was written because btrfs had a bug where it would lose file
foo if we fsynced only directory "/a".

I should have made the test check that after log replay, file foo
exists either at "/c/foo" or at "/a/b/foo" - that the file was not
lost.
Perhaps that's a better change than making the test fsync file foo explicitly.

As I mentioned in the changelog, in this particular case the rename is
persisted,
because all the metadata is very small and fits in a single leaf of metadata.

But, for example, adding those 1000 files after creating "/a" and
another 1000 files after creating "/a/b", makes
the test fail because it expects the rename to have been persisted,
that file foo is at "/c/foo" - in that case
the file is still at "/a/b/foo".

Thanks.


Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help