Re: [PATCH 2/6] common: capture metadump output if xfs filesystem check fails
From: "Darrick J. Wong" <djwong@kernel.org>
Date: 2021-02-11 18:27:54
On Thu, Feb 11, 2021 at 08:59:58AM -0500, Brian Foster wrote:
> On Tue, Feb 09, 2021 at 06:56:30PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <redacted>
> >
> > Capture metadump output when various userspace repair and checker tools
> > fail or indicate corruption, to aid in debugging.  We don't bother to
> > annotate xfs_check because it's bitrotting.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  README     |    2 ++
> >  common/xfs |   26 ++++++++++++++++++++++++++
> >  2 files changed, 28 insertions(+)
> >
> > diff --git a/README b/README
> > index 43bb0cee..36f72088 100644
> > --- a/README
> > +++ b/README
> > @@ -109,6 +109,8 @@ Preparing system for tests:
> >       - Set TEST_FS_MODULE_RELOAD=1 to unload the module and reload it between
> >         test invocations.  This assumes that the name of the module is the same
> >         as FSTYP.
> > +     - Set SNAPSHOT_CORRUPT_XFS=1 to record compressed metadumps of XFS
> > +       filesystems if the various stages of _check_xfs_filesystem fail.
> >       - or add a case to the switch in common/config assigning
> >         these variables based on the hostname of your test
> > diff --git a/common/xfs b/common/xfs
> > index 2156749d..ad1eb6ee 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -432,6 +432,21 @@ _supports_xfs_scrub()
> >  	return 0
> >  }
> >
> > +# Save a compressed snapshot of a corrupt xfs filesystem for later debugging.
> > +_snapshot_xfs() {
>
> The term snapshot has a well known meaning. Can we just call this
> _metadump_xfs()?

Ok.

> > +	local metadump="$1"
> > +	local device="$2"
> > +	local logdev="$3"
> > +	local options="-a -o"
> > +
> > +	if [ "$logdev" != "none" ]; then
> > +		options="$options -l $logdev"
> > +	fi
> > +
> > +	$XFS_METADUMP_PROG $options "$device" "$metadump" >> "$seqres.full" 2>&1
> > +	gzip -f "$metadump" >> "$seqres.full" 2>&1 &
>
> Why compress in the background?

Sometimes the metadumps can become very large and I don't tend to have a lot of space on the test appliances for storing blobs. Also, I was under the impression that it was customary for people to share compressed metadumps of crashes, so why not save everyone a step? I do this in the background to avoid holding up the next fstest.

> I wonder if we should just skip the compression step since this
> requires an option to enable in the first place..

Seeing as it's optional, I think that's all the more reason to compress.
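
For illustration, the helper with the rename agreed above and the background compression kept might look roughly like the sketch below. This is a hedged sketch, not the final fstests code: the fallback values for XFS_METADUMP_PROG and seqres are placeholders so the fragment can be sourced on its own (the harness normally sets both), and the comments are editorial.

```shell
#!/bin/bash
# Hypothetical sketch of the renamed helper; not the final fstests code.
# XFS_METADUMP_PROG and seqres normally come from the fstests harness;
# the defaults here are placeholders so the sketch stands alone.
XFS_METADUMP_PROG="${XFS_METADUMP_PROG:-xfs_metadump}"
seqres="${seqres:-/tmp/metadump-sketch}"

# Save a compressed metadump of a corrupt xfs filesystem for later debugging.
_metadump_xfs() {
	local metadump="$1"
	local device="$2"
	local logdev="$3"
	local options="-a -o"

	if [ "$logdev" != "none" ]; then
		options="$options -l $logdev"
	fi

	# Take the dump synchronously so the image reflects the filesystem
	# state at the moment the check failed.
	$XFS_METADUMP_PROG $options "$device" "$metadump" >> "$seqres.full" 2>&1

	# Compress in the background so the next test is not held up.
	# -f overwrites a stale .gz left behind by an earlier run.
	gzip -f "$metadump" >> "$seqres.full" 2>&1 &
}
```

A caller that needs the compressed file immediately (for example, to copy it off the test machine) would have to `wait` for the background gzip first.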

> > +}
> > +
> >  # run xfs_check and friends on a FS.
> >  _check_xfs_filesystem()
> >  {
> ...
> > @@ -540,6 +564,8 @@ _check_xfs_filesystem()
> >  			cat $tmp.repair >>$seqres.full
> >  			echo "*** end xfs_repair output" >>$seqres.full
> > +			test "$SNAPSHOT_CORRUPT_XFS" = "1" && \
> > +				_snapshot_xfs "$seqres.rebuildrepair.md" "$device" "$2"
>
> Why do we collect so many metadump images? Shouldn't all but the last
> TEST_XFS_REPAIR_REBUILD thing not modify the fs? If so, it seems like we
> should be able to collect one image (and perhaps just call it
> "$seqres.$device.md") if any of the first several checks flag a problem.

Yes, the number of metadumps collected can be reduced to two: one if
scrub, logprint, or repair -n fails, and a second one if the user set
TEST_XFS_REPAIR_REBUILD=1 and either the repair or the repair -n fails.
Will change that.

--D

> Brian
>
> >  			ok=0
> >  		fi
> >  		rm -f $tmp.repair
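
The two-image policy described in the reply above could be sketched as follows. This is only an illustration of the decision logic under stated assumptions: the function name _plan_metadumps, its argument order, and the "check"/"rebuild" labels are hypothetical and do not appear in the patch; only SNAPSHOT_CORRUPT_XFS and TEST_XFS_REPAIR_REBUILD come from the thread.

```shell
#!/bin/bash
# Hypothetical sketch of the reduced dump policy. _plan_metadumps and its
# output labels are illustrative names, not part of fstests.
#
# Arguments are exit statuses (0 = passed) of each stage, in order:
#   scrub, logprint, repair -n, rebuild repair, repair -n after rebuild
# Prints one label per metadump that should be collected.
_plan_metadumps() {
	local scrub="$1" logprint="$2" repair_n="$3"
	local rebuild="$4" rebuild_check="$5"

	# Dumps are opt-in; do nothing unless the knob is set.
	[ "$SNAPSHOT_CORRUPT_XFS" = "1" ] || return 0

	# First image: any of the read-only checks flagged a problem.
	if [ "$scrub" -ne 0 ] || [ "$logprint" -ne 0 ] || [ "$repair_n" -ne 0 ]; then
		echo "check"
	fi

	# Second image: only if the rebuild pass was requested and either the
	# rebuilding repair or the repair -n that follows it failed.
	if [ "$TEST_XFS_REPAIR_REBUILD" = "1" ]; then
		if [ "$rebuild" -ne 0 ] || [ "$rebuild_check" -ne 0 ]; then
			echo "rebuild"
		fi
	fi
}
```

In a real _check_xfs_filesystem the first image would have to be taken before the rebuilding repair runs, since that is the only stage expected to modify the filesystem.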