Thread: 15 messages, 3 authors, 2021-07-25

Re: bad file extent, some csum missing - how to check that restored volumes are error-free?

From: Qu Wenruo <hidden>
Date: 2021-07-16 01:05:16


On 2021/7/16 6:49 AM, Dave T wrote:
OK, lowmem mode indeed did a much better job.

This is a very strange bug.

This means:

- The compressed extent doesn't have a csum
    Which shouldn't be possible with recent kernels.

- The compressed extent exists for an inode which has the NODATASUM flag
    Again, not possible with recent kernels.

But IIRC there are old kernels allowing such compression + nodatasum.

I guess that's the reason why you got EIO when reading it.

When we fail to find the csum, we just use 0x00 as the csum, so when you
read the data it's definitely going to cause a csum mismatch and nothing
gets read out.
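
The mismatch described above can be illustrated with a minimal sketch. It assumes the default crc32c data checksum; `crc32c` and `verify_block` are hypothetical helper names written for this illustration, not kernel or btrfs-progs code.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial when that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def verify_block(block: bytes, stored_csum: int) -> bool:
    """Return True only when the stored csum matches the data (hypothetical helper)."""
    return crc32c(block) == stored_csum


if __name__ == "__main__":
    block = b"some file contents".ljust(4096, b"\0")
    # A csum that was actually computed from the data verifies.
    print(verify_block(block, crc32c(block)))
    # A missing csum replaced by 0x00 is compared against the real checksum
    # of the data, so the read is rejected unless that checksum is itself 0.
    print(verify_block(block, 0x00))
```

The data itself may be perfectly fine; the read fails only because the placeholder value has nothing to do with the block's real checksum.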

This can be worked around with the recent "rescue=idatacsums" mount option.
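
To see which files actually hit the EIO, for example before and after mounting with that option, one rough approach is to read every regular file and record the ones that fail. This is only a sketch: `find_unreadable_files` is a hypothetical helper, and a clean run only shows that the files could be read, not that the filesystem metadata is consistent.

```python
import errno
import os
import stat
from typing import List


def find_unreadable_files(root: str, chunk_size: int = 1 << 20) -> List[str]:
    """Read every regular file under root and return the paths that fail with EIO."""
    failed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks, FIFOs and device nodes are skipped, not opened.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as f:
                    # Read the whole file in chunks and discard the data.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                if exc.errno == errno.EIO:
                    failed.append(path)
                # Other errors (e.g. permission denied) are ignored in this sketch.
    return failed


if __name__ == "__main__":
    import sys

    for bad in find_unreadable_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(bad)
```

Run against the mount point of the volume in question; the paths it prints are the candidates for files with missing csums.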

But to me, this really looks like some old fs, with some inodes created
by older kernels.
I'm running:
kernel version 5.12.15-arch1-1 (linux@archlinux)

I've been running arch + btrfs since 2014. I keep arch linux fully
updated. I'm running new kernels and new btrfs progs. However, I
created this filesystem around 2014.
The change that disallows compression if the inode has the NODATASUM
flag was introduced in commit 42c16da6d684 ("btrfs: inode: Don't
compress if NODATASUM or NODATACOW set"), which is from v5.2 in 2019.

Thus such an old fs can indeed be affected.
Is there an option to "update" my BTRFS filesystem? Is that even a thing?
I don't think so, but please allow me to do more testing and then I may
craft a fix in btrfs-progs to allow btrfs-check to repair such problems.

If possible I would enhance the kernel to handle such existing file extents
better, so that all you really need to do is run "pacman -Syu" as usual,
with nothing else to bother about.

Thanks,
Qu
I have multiple devices running on BTRFS filesystems created around
2014 to 2016. Are those all in danger of having some problems now?
BTRFS has been mostly problem-free for me since before 2014. I do
regular balance and scrubs. However, I'm getting worried about my data
now...

I hope I do not need to backup every device, recreate the filesystems,
and restore them. That would be weeks of work and I'm already
overworked... but losing data would be worse.

BTW, even my backup disks run on BTRFS filesystems that were created years ago.
Are any of these options appropriate?

-  btrfs rescue chunk-recover /dev/mapper/xyz
Definite no.

Any rescue command should only be used when a developer suggests it.
Thank you for reminding me! There's a lot of bad BTRFS advice on all
the various forums, and it is easy to be influenced by it when you are
a casual user like me.

- btrfs check --repair --init-csum-tree /dev/mapper/xyz
This may solve the read error, but we will still report the NODATASUM
problem for the compressed extent.

Have you tried to remove the NODATASUM option for those involved inodes?
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
says:
Note: If compression is enabled, nodatacow and nodatasum are disabled.

My mount options are:
rw,autodefrag,noatime,nodiratime,compress=lzo,space_cache,subvol=xyz

Do I understand it correctly? My compression option should already
"remove the NODATASUM".
If it's possible to remove NODATASUM for those inodes, then
--init-csum-tree should be able to solve the problem.
What do you recommend now?