Re: btrfs cannot be mounted or checked
From: Qu Wenruo <hidden>
Date: 2021-07-11 12:00:19
On 2021/7/11 7:37 PM, Forza wrote:
On 2021-07-11 10:59, Zhenyu Wu wrote:
Sorry to bother you. After a dirty reboot caused by a computer crash, my btrfs partition cannot be mounted. The same thing happened before, but this time `btrfs rescue zero-log` does not help.

    $ uname -r
    5.10.27-gentoo-x86_64
    $ btrfs rescue zero-log /dev/sda2
    Clearing log on /dev/sda2, previous log_root 0, level 0
    $ mount /dev/sda2 /mnt/gentoo
    mount: /mnt/gentoo: wrong fs type, bad option, bad superblock on /dev/sda2, missing codepage or helper program, or other error.
    $ btrfs check /dev/sda2
    parent transid verify failed on 34308096 wanted 962175 found 961764
    parent transid verify failed on 34308096 wanted 962175 found 961764
    parent transid verify failed on 34308096 wanted 962175 found 961764
    Ignoring transid failure
    leaf parent key incorrect 34308096
    ERROR: failed to read block groups: Operation not permitted
    ERROR: cannot open file system
    $ dmesg 2>&1|tee dmesg.txt
    # see attachment

`mount -o ro,usebackuproot` does not work either.

Thanks for any help!

Hi!

"Parent transid verify failed" is hard to recover from, as mentioned on https://btrfs.wiki.kernel.org/index.php/FAQ#How_do_I_recover_from_a_.22parent_transid_verify_failed.22_error.3F

I see you have "corrupt 5" sectors in dmesg. Is your disk healthy? You can check its health with "smartctl -x /dev/sda".

One way of avoiding this error is to disable the write cache. Parent transid failures can happen when the disk re-orders writes in its write cache before flushing them to disk. This violates barriers, but it is unfortunately common. If you have a crash, a SATA bus reset, or other issues, unwritten content is lost. The problem here is the re-ordering: the superblock is written out before other metadata (which is now lost due to the crash).
To be extra accurate, all filesystems take re-ordering into consideration. That is why we have the flush (also called barrier) command, which forces the disk to write its entire cache back to disk, or at least to non-volatile cache.

Combined with mandatory metadata CoW, this means that whether or not the disk re-orders writes, we should only ever see either the newer data from after the flush or the older data from before the flush.

But unfortunately hardware is unreliable, and sometimes it even lies about its flush command. Some disks, and especially some cheap RAID cards, tend to simply ignore flush commands, which leaves the data corrupted after a power loss.

Thanks,
Qu
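
The ordering described above (CoW write, flush, superblock update, flush) can be illustrated with a toy model. This is only a sketch and not btrfs code: plain files stand in for disk blocks, `sync` stands in for the device flush command, and the helper name `commit_generation` and the file names `tree.N` and `super` are made up for the illustration.

```sh
#!/bin/sh
# Toy model of "CoW + flush" ordering. Plain files stand in for disk blocks;
# this is an illustration only, not how btrfs is implemented.
set -eu

# commit_generation DIR GEN  (hypothetical helper, not a btrfs tool)
commit_generation() {
    dir=$1
    gen=$2
    # 1. CoW: write the new tree to a new location, never over the old one.
    echo "tree generation $gen" > "$dir/tree.$gen"
    # 2. Flush, so the new tree is durable before anything points at it.
    sync
    # 3. Only now switch the "superblock" to the new tree (rename is atomic).
    echo "tree.$gen" > "$dir/super.tmp"
    mv "$dir/super.tmp" "$dir/super"
    # 4. Flush again so the new superblock is durable too.
    sync
}

demo_dir=$(mktemp -d)
commit_generation "$demo_dir" 1
commit_generation "$demo_dir" 2
echo "super points to: $(cat "$demo_dir/super")"
rm -rf "$demo_dir"
```

In this model a crash at any step leaves `super` pointing at a complete tree, either the old one or the new one. If the device ignored the flush in step 2, the new `super` could reach the media before `tree.2` does, which is roughly the situation behind a "wanted 962175 found 961764" mismatch.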
You disable the write cache with "hdparm -W0 /dev/sda". It might be worth adding this to a cron job that runs every 5 minutes or so, as the setting is not persistent and can get reset if the disk loses power, goes to sleep, etc.
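
A minimal sketch of a script such a cron job could call is shown below. It assumes that `hdparm -W DEVICE` reports the current state on a line of the form `write-caching =  1 (on)`. The function names `write_cache_state` and `ensure_write_cache_off` and the `HDPARM` override variable are made up for this example. It only issues `-W0` when the cache is reported as on.

```sh
#!/bin/sh
# Sketch: turn the drive write cache off if hdparm reports it as on.
# Assumes `hdparm -W DEV` prints a line like " write-caching =  1 (on)".
set -eu

# Allow the hdparm binary to be overridden (hypothetical knob, for testing).
HDPARM=${HDPARM:-hdparm}

# Read `hdparm -W` output on stdin; print 1, 0, or "unknown".
write_cache_state() {
    awk 'BEGIN { state = "" }
         /write-caching/ { state = $3 }
         END {
             if (state == "0" || state == "1") print state
             else print "unknown"
         }'
}

# ensure_write_cache_off DEVICE
ensure_write_cache_off() {
    dev=$1
    state=$("$HDPARM" -W "$dev" | write_cache_state)
    if [ "$state" = "1" ]; then
        "$HDPARM" -W0 "$dev" > /dev/null
        echo "write cache disabled on $dev"
    else
        echo "write cache state on $dev: $state (nothing to do)"
    fi
}

# Act only when a device is given, e.g.: script.sh /dev/sda
if [ "$#" -gt 0 ]; then
    ensure_write_cache_off "$1"
fi
```

Saved under a path of your choosing (the path here is only an example), it could be run from root's crontab every 5 minutes:

    */5 * * * * /usr/local/sbin/disable-write-cache.sh /dev/sda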