Thread: 6 messages, 2 authors, 2021-03-23

Re: parent transid verify failed / ERROR: could not setup extent tree

From: Chris Murphy <hidden>
Date: 2021-03-22 19:50:22

On Mon, Mar 22, 2021 at 12:32 AM Dave T [off-list ref] wrote:
On Sun, Mar 21, 2021 at 2:03 PM Chris Murphy [off-list ref] wrote:
On Sat, Mar 20, 2021 at 11:54 PM Dave T [off-list ref] wrote:
# btrfs check -r 2853787942912 /dev/mapper/xyz
Opening filesystem to check...
parent transid verify failed on 2853787942912 wanted 29436 found 29433
parent transid verify failed on 2853787942912 wanted 29436 found 29433
parent transid verify failed on 2853787942912 wanted 29436 found 29433
Ignoring transid failure
parent transid verify failed on 2853827723264 wanted 29433 found 29435
parent transid verify failed on 2853827723264 wanted 29433 found 29435
parent transid verify failed on 2853827723264 wanted 29433 found 29435
Ignoring transid failure
leaf parent key incorrect 2853827723264
ERROR: could not setup extent tree
ERROR: cannot open file system
btrfs insp dump-t -t 2853827723264 /dev/
# btrfs insp dump-t -t 2853827723264 /dev/mapper/xzy
btrfs-progs v5.11
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
WARNING: could not setup extent tree, skipping it
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
Couldn't setup device tree
ERROR: unable to open /dev/mapper/xzy

# btrfs insp dump-t -t 2853787942912 /dev/mapper/xzy
btrfs-progs v5.11
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
WARNING: could not setup extent tree, skipping it
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
Couldn't setup device tree
ERROR: unable to open /dev/mapper/xzy

# btrfs insp dump-t -t 2853827608576 /dev/mapper/xzy
btrfs-progs v5.11
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
WARNING: could not setup extent tree, skipping it
parent transid verify failed on 2853827608576 wanted 29436 found 29433
Ignoring transid failure
leaf parent key incorrect 2853827608576
Couldn't setup device tree
ERROR: unable to open /dev/mapper/xzy

That does not look promising. I don't know whether a read-write mount
with usebackuproot will recover the filesystem or make things worse.

Options:

a. btrfs check --repair
This will probably fail on the same problem: it can't set up the extent tree.

b. btrfs check --init-extent-tree
This is a heavy hammer. It might succeed, but it takes a long time: on
5T it could take double-digit hours or even single-digit days. It's
generally faster to just wipe the drive and restore from backups than
to use init-extent-tree (I understand this *is* your backup).

c. Set up an overlay file with device mapper to redirect the writes
from a read-write mount with usebackuproot. I think it's sufficient to
just mount, optionally write some files (empty or not), and umount.
Then do a btrfs check to see if the current tree is healthy.
https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

That guide is a bit complex because it deals with many drives in mdadm
raid, so you can simplify it for just one drive. The gist is that no
writes go to the drive itself; it's treated as read-only by
device-mapper (in fact you can optionally add a pre-step with the
blockdev command and --setro to make sure the entire drive is
read-only; just make sure to make it rw once you're done testing). All
the writes with this overlay go into a loop-mounted file which you
intentionally just throw away after testing.

d. Just skip the testing and try usebackuproot with a read-write
mount. It might make things worse, but at least it's fast to test. If
it messes things up, you'll have to recreate this backup from scratch.
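
For option c, a single-drive version of the linked overlay procedure
might look roughly like the sketch below. It is untested against this
filesystem. The function names, the 20G overlay size, the mapping name,
and the file paths are all assumptions made for illustration; the
"P 8" at the end of the table line follows the linked guide. Run as
root.

```shell
#!/bin/sh
# Sketch only: function names, sizes, and paths are illustrative assumptions.

# Build the table line for a device-mapper snapshot target:
#   <start> <length> snapshot <origin> <COW device> <P|N> <chunk size>
# Length and chunk size are counted in 512-byte sectors.
overlay_table() {
    printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Create /dev/mapper/<name>. Writes land in the sparse file <cow>,
# never on <dev>.
overlay_up() {
    dev=$1 name=$2 cow=$3
    truncate -s 20G "$cow" || return 1            # sparse file; size is a guess
    loop=$(losetup -f --show "$cow") || return 1  # attach it to a free loop device
    sectors=$(blockdev --getsz "$dev") || return 1
    dmsetup create "$name" --table "$(overlay_table "$sectors" "$dev" "$loop")"
}

# Tear the overlay down and discard everything written during the test.
overlay_down() {
    name=$1 cow=$2
    dmsetup remove "$name"
    losetup -j "$cow" | cut -d: -f1 | xargs -r losetup -d
    rm -f "$cow"
}

# Possible usage (not executed when this file is sourced):
#   blockdev --setro /dev/mapper/xyz              # optional pre-step
#   overlay_up /dev/mapper/xyz xyz-overlay /var/tmp/xyz-overlay.img
#   mount -o usebackuproot /dev/mapper/xyz-overlay /mnt
#   umount /mnt
#   btrfs check /dev/mapper/xyz-overlay
#   overlay_down xyz-overlay /var/tmp/xyz-overlay.img
#   blockdev --setrw /dev/mapper/xyz
```

If the check on the overlay comes back clean, that suggests (but does
not prove) that the same mount against the real device would behave the
same way.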

As for how to prevent this? I'm not sure. About the best we can do is
disable the drive write cache with a udev rule, and/or raid1 with
another make/model drive, and let Btrfs detect occasional corruption
and self-heal from the good copy. Another obvious way to avoid the
problem is to stop having power failures, crashes, and accidental USB
cable disconnections :)
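
As a rough illustration of the udev idea, the helper below prints one
rule that runs hdparm -W 0 when a given drive appears. The function
name, the serial-number match on ID_SERIAL_SHORT, and the
/usr/bin/hdparm path are assumptions; some USB bridges do not pass the
hdparm command through, so check afterwards with hdparm -W on the
device.

```shell
#!/bin/sh
# Sketch only: the helper name, match keys, and hdparm path are assumptions.

# Print a udev rule that disables the volatile write cache on the drive
# whose short serial number is $1. %k expands to the kernel device name.
write_cache_rule() {
    printf 'ACTION=="add|change", KERNEL=="sd[a-z]", ENV{ID_SERIAL_SHORT}=="%s", RUN+="/usr/bin/hdparm -W 0 /dev/%%k"\n' "$1"
}

# Possible usage (the rules file name is hypothetical):
#   write_cache_rule YOUR_SERIAL > /etc/udev/rules.d/69-disable-write-cache.rules
#   udevadm control --reload
```

The serial can be read from udevadm info --query=property on the
device node.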

It's not any one thing that's the problem. It's a sequence of problems
happening in just the right (or wrong) order that causes the problem.
Bugs + mistake + bad luck = problem.

-- 
Chris Murphy