Re: Help recovering filesystem (if possible)

From: Matthew Dawson <hidden>
Date: 2021-11-18 02:57:48

On Monday, November 15, 2021 5:46:43 A.M. EST Kai Krakow wrote:

On Mon, 15 Nov 2021 at 02:55, Matthew Dawson [off-list ref] wrote:

quoted

I recently upgraded one of my machines to the 5.15.2 kernel.  On the first
reboot, I had a kernel fault during the initialization (I didn't get to
capture the printed stack trace, but I'm 99% sure it did not have BTRFS
related calls).  I then rebooted the machine back to a 5.14 kernel, but the
BCache (writeback) cache was corrupted.  I then force started the
underlying disks, but now my BTRFS filesystem will no longer mount.  I
realize there may be missing/corrupted data, but I would ideally like to
get any data I can off the disks.

I had a similar issue lately where the system didn't reboot cleanly
(there's some issue in the BIOS or with the SSD firmware where it
would disconnect the SSD from SATA a few seconds after boot, forcing
bcache into detaching dirty caches).

Since you are seeing transaction IDs lagging behind expectations, I
think you've lost dirty writeback data from bcache. To avoid this in the
future, you should use bcache only in writearound or writethrough
mode.

Considering I started the bcache devices without the cache, I have no doubt 
that I've lost writeback data and that there will be issues.  At this point 
I'm just doing data recovery, trying to get what I can.

quoted

This system involves ten 8TB disks; some are doing BCache -> LUKS -> BTRFS,
some are doing LUKS -> BTRFS.

Not LUKS here, and all my btrfs pool members are attached to a single
SSD as caching frontend.

quoted

When I try to mount the filesystem, I get the following in dmesg:
[117632.798339] BTRFS info (device dm-0): flagging fs with big metadata feature
[117632.798344] BTRFS info (device dm-0): disk space caching is enabled
[117632.798346] BTRFS info (device dm-0): has skinny extents
[117632.873186] BTRFS error (device dm-0): parent transid verify failed on 132806584614912 wanted 3240123 found 3240119
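
As an aside for readers triaging similar logs: the "wanted"/"found" pair in the error above shows how many transactions the on-disk metadata is behind. The following is only a minimal sketch, assuming the message format shown in this dmesg excerpt; the function name `transid_gaps` is made up for illustration.

```python
import re

# Pattern for the "parent transid verify failed" message as it appears
# in the dmesg excerpt above (assumed format, not an official grammar).
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): parent transid verify failed on "
    r"(?P<bytenr>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def transid_gaps(log_text):
    """Return one dict per transid failure, including wanted - found."""
    # Mail clients and terminals wrap long lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    results = []
    for m in TRANSID_RE.finditer(flat):
        wanted = int(m["wanted"])
        found = int(m["found"])
        results.append({
            "device": m["dev"],
            "bytenr": int(m["bytenr"]),
            "wanted": wanted,
            "found": found,
            "gap": wanted - found,
        })
    return results
```

For the error line quoted above, the gap is 3240123 - 3240119 = 4 transactions.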

I had luck with the following steps:

* ensure that all members are attached to bcache as they should be
* ensure bcache is running in writearound mode for each member
* ensure that btrfs has scanned all members

Next, I ran `btrfs check` on each member disk. Eventually one of them
contained the needed on-disk structures and showed only a few errors.
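
The per-member check described here could be scripted roughly as follows. This is a sketch, not a tested recovery tool: the error-counting heuristic and the function names are assumptions, and the device paths you pass in are your own. It uses `btrfs check --readonly`, which does not write to the device.

```python
import subprocess

# Assumed heuristic: lines containing these markers count as problems.
ERROR_MARKERS = ("ERROR:", "parent transid verify failed")


def count_error_lines(output):
    """Count output lines that look like reported problems."""
    return sum(
        1 for line in output.splitlines()
        if any(marker in line for marker in ERROR_MARKERS)
    )


def check_member(device, run=subprocess.run):
    """Run a read-only btrfs check on one member and count error lines."""
    proc = run(
        ["btrfs", "check", "--readonly", device],
        capture_output=True,
        text=True,
    )
    return count_error_lines(proc.stdout + proc.stderr)


def rank_members(devices, run=subprocess.run):
    """Return (error_count, device) pairs, fewest errors first."""
    return sorted((check_member(dev, run), dev) for dev in devices)
```

The member at the front of the returned list would be the first candidate to try mounting through. The `run` parameter exists only so the command runner can be swapped out when trying the logic without real disks.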

I was then able to mount btrfs through that device node, open ctree
didn't fail this time. I don't remember if I used "usebackuproot" for
mount or a similar switch for "btrfs check".

I then ran `btrfs scrub` which fixed the broken metadata. Luckily, I
had only metadata corruption on the disks which had dirty writeback
cleared, and metadata runs in RAID-1 mode for me.

"btrfs check" then didn't find any errors. Reboot worked fine.

Thanks for the suggestion.  Unfortunately, all my disks report basically the 
same errors, so I wasn't able to recover my system this way.

[...]

quoted

Is there any hope in recovering this data?  Or should I give up on it at
this point and reformat?  Most of the data is backed up (or are backups
themselves), but I'd like to get what I can.

Well, I'm doing daily backups with borg - to a different technology
(no btrfs, no bcache, different system). I don't think backing up
btrfs to btrfs is a brilliant idea, especially not when both are
mounted to the same system.

I'm not quite that redundant, but the backups of things I really care about 
are actually to an off-site system.  But accessing data through a backup can be 
painful compared to hopefully just getting it out.  Also the local backups on 
the system would be nice to have, for historical purposes.

You may try my steps above. If you've found a member device which
shows fewer errors, you COULD try to repair it if mount still fails
(or try one of the recovery mount options). But you may want to ask
the experts again here.

I did try, thanks.  Unfortunately as noted above it wasn't helpful.

Hopefully someone has a different idea?  I am posting here because I feel any 
further progress will mean using more dangerous options, and those usually say 
to ask the mailing list first.

Depending on how much dirty writeback you've lost in bcache, chances
may be good that one of the members has enough metadata to
successfully mount or repair the filesystem. Or at least, it's a good
start for "btrfs restore" then.
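
For reference, the non-destructive attempts mentioned in this thread (a read-only mount with "usebackuproot", then "btrfs restore") could be assembled like this. It is a sketch under assumptions: the helper names are invented, the device and directory arguments are placeholders, and it only builds the command lines rather than running them.

```python
def readonly_mount_cmd(device, mountpoint):
    """Read-only mount attempt that falls back to a backup tree root."""
    return ["mount", "-t", "btrfs", "-o", "ro,usebackuproot", device, mountpoint]


def restore_cmd(device, target_dir, dry_run=True):
    """btrfs restore copies files out without mounting or modifying the fs.

    -i ignores errors and keeps going, -v is verbose,
    -D is a dry run that only lists what would be restored.
    """
    cmd = ["btrfs", "restore", "-i", "-v"]
    if dry_run:
        cmd.append("-D")
    return cmd + [device, target_dir]
```

Starting with `dry_run=True` gives an idea of how much is reachable before committing disk space on the target.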

What do we learn from this?

* probably do not use bcache in writeback mode if you can avoid it
* switch bcache to writearound mode before kernel upgrades, wait for
writeback to finish
* success mounting btrfs may depend a lot on which member device you
actually mount

Thanks,
-- 
Matthew
