Re: btrfs cannot be mounted or checked

From: Qu Wenruo <hidden>
Date: 2021-07-14 08:59:14


On 2021/7/14 4:49 PM, Zhenyu Wu wrote:

Sorry for the late reply :(

I found <https://bbs.archlinux.org/viewtopic.php?id=233724>, which looks
the same as my situation. But on my computer (booted from a live USB),
`btrfs check --init-extent-tree` output a lot of non-ASCII characters
(maybe because ANSI escape codes messed up the terminal).
After several days it printed `7/7` and `killed`. The fix appears to have failed.

Sorry, my live USB doesn't have smartctl :(

$ hdparm -W0 /dev/sda
/dev/sda:
  setting drive write-caching to 0 (off)
  write-caching =  0 (off)

But the btrfs partition still cannot be mounted.

When I try to mount it with `usebackuproot`, it prints the same
error message, and dmesg shows:

[250062.064785] BTRFS warning (device sda2): 'usebackuproot' is
deprecated, use 'rescue=usebackuproot' instead
[250062.064788] BTRFS info (device sda2): trying to use backup root at
mount time
[250062.064789] BTRFS info (device sda2): disk space caching is enabled
[250062.064790] BTRFS info (device sda2): has skinny extents
[250062.208403] BTRFS info (device sda2): bdev /dev/sda2 errs: wr 0,
rd 0, flush 0, corrupt 5, gen 0
[250062.277045] BTRFS critical (device sda2): corrupt leaf: root=2
block=273006592 slot=17 bg_start=1104150528 bg_len=1073741824, invalid
block group used, have 1073754112 expect [0, 1073741824)

This looks like a bad extent tree re-initialization, so probably a bug in
btrfs-progs.

For now, you can try mounting with "ro,rescue=ibadroots" to see whether it
can be mounted read-only, and then rescue your data.

Thanks,
Qu

[250062.277048] BTRFS error (device sda2): block=273006592 read time
tree block corruption detected
[250062.291924] BTRFS critical (device sda2): corrupt leaf: root=2
block=273006592 slot=17 bg_start=1104150528 bg_len=1073741824, invalid
block group used, have 1073754112 expect [0, 1073741824)
[250062.291927] BTRFS error (device sda2): block=273006592 read time
tree block corruption detected
[250062.291943] BTRFS error (device sda2): failed to read block groups: -5
[250062.292897] BTRFS error (device sda2): open_ctree failed
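
The "invalid block group used" complaint in the log above comes down to a range check: the block group item claims more used bytes than the block group is long. Below is a minimal Python sketch of that arithmetic, using only the numbers from the log. The function name `block_group_used_ok` is hypothetical, and treating `used == length` as valid is an assumption; this is not the kernel's code.

```python
# Values copied from the dmesg line above.
bg_len = 1073741824   # block group length in bytes (1 GiB)
used = 1073754112     # "used" value stored in the block group item


def block_group_used_ok(used: int, length: int) -> bool:
    """Illustrative bound check: used bytes cannot exceed the block group length.

    Assumption: a completely full block group (used == length) is valid.
    """
    return 0 <= used <= length


overshoot = used - bg_len
print(block_group_used_ok(used, bg_len))  # False
print(overshoot)                          # 12288
```

The item overshoots by 12288 bytes, so the tree block fails the read-time check and the mount aborts while reading block groups.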

Without `usebackuproot`, dmesg prints the same log except for the first 2 lines.

Now btrfs check can check this partition:

$ btrfs check /dev/sda2 2>&1|tee check.txt
# see attachment

Is there any hope of rescuing my disk?
Thanks!

On 7/11/21, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2021/7/11 7:37 PM, Forza wrote:
>>
>>
>> On 2021-07-11 10:59, Zhenyu Wu wrote:
>>> Sorry to bother you.
>>> After a dirty reboot because of a computer crash, my btrfs partition
>>> cannot be mounted. The same thing happened before, but this time `btrfs
>>> rescue zero-log` does not help.
>>> ```
>>> $ uname -r
>>> 5.10.27-gentoo-x86_64
>>> $ btrfs rescue zero-log /dev/sda2
>>> Clearing log on /dev/sda2, previous log_root 0, level 0
>>> $ mount /dev/sda2 /mnt/gentoo
>>> mount: /mnt/gentoo: wrong fs type, bad option, bad superblock on
>>> /dev/sda2, missing codepage or helper program, or other error.
>>> $ btrfs check /dev/sda2
>>> parent transid verify failed on 34308096 wanted 962175 found 961764
>>> parent transid verify failed on 34308096 wanted 962175 found 961764
>>> parent transid verify failed on 34308096 wanted 962175 found 961764
>>> Ignoring transid failure
>>> leaf parent key incorrect 34308096
>>> ERROR: failed to read block groups: Operation not permitted
>>> ERROR: cannot open file system
>>> $ dmesg 2>&1|tee dmesg.txt
>>> # see attachment
>>> ```
>>> `mount -o ro,usebackuproot` does not work either.
>>>
>>> Thanks for any help!
>>>
>>
>>
>> Hi!
>>
>> Parent transid failed is hard to recover from, as mentioned on
>> https://btrfs.wiki.kernel.org/index.php/FAQ#How_do_I_recover_from_a_.22parent_transid_verify_failed.22_error.3F
>>
>>
>> I see you have "corrupt 5" sectors in dmesg. Is your disk healthy? You
>> can check with "smartctl -x /dev/sda" to determine the health.
>>
>> One way of avoiding this error is to disable write-cache. Parent transid
>> failed can happen when the disk re-orders writes in its write cache
>> before flushing to disk. This violates barriers, but it is unfortunately
>> common. If you have a crash, SATA bus reset or other issues, unwritten
>> content is lost. The problem here is the re-ordering. The superblock is
>> written out before other metadata (which is now lost due to the crash).
>
> To be extra accurate, all filesystems have taken re-ordering into
> consideration.
> That is why we have the flush (also called barrier) command, which forces
> the disk to write all of its cache back to disk, or at least to
> non-volatile cache.
>
> Combined with mandatory metadata CoW, this means that whether or not the
> disk re-orders writes, we should only ever see either the newer data
> after the flush or the older data before the flush.
>
> But unfortunately, hardware is unreliable, and sometimes even lies about
> its flush command.
> Thus it's possible that some disks, and especially some cheap RAID cards,
> just ignore such flush commands, leaving the data corrupted after a
> power loss.
>
> Thanks,
> Qu
>
>>
>> You disable write cache with "hdparm -W0 /dev/sda". It might be worth
>> adding this to a cron-job every 5 minutes or so, as the setting is not
>> persistent and can get reset if the disk loses power, goes to sleep,
>> etc.
>
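
The flush-plus-CoW argument quoted above can be illustrated with a toy model. The sketch below is not how btrfs or any real drive is implemented; `ToyDisk`, `commit`, `interrupted_commit` and `run` are hypothetical names invented for the illustration. It only shows why a drive that honors flushes never leaves the superblock pointing at a missing tree block, while a drive that ignores them can.

```python
class ToyDisk:
    """Toy drive with a volatile write cache (illustrative only)."""

    def __init__(self, honors_flush):
        self.honors_flush = honors_flush
        self.persistent = {}  # survives a crash
        self.cache = {}       # lost on a crash

    def write(self, addr, value):
        self.cache[addr] = value

    def flush(self):
        # An honest drive drains its cache; a lying one just acknowledges.
        if self.honors_flush:
            self.settle()

    def settle(self):
        # Models an idle drive eventually writing everything out.
        self.persistent.update(self.cache)
        self.cache.clear()

    def crash(self, survivors=()):
        # Only the cached writes the drive happened to write out survive.
        for addr in survivors:
            if addr in self.cache:
                self.persistent[addr] = self.cache[addr]
        self.cache.clear()


def commit(disk, generation):
    node_addr = f"node@{generation}"  # CoW: new metadata, new location
    disk.write(node_addr, generation)
    disk.flush()                      # tree block must be durable first
    disk.write("super", (node_addr, generation))
    disk.flush()


def interrupted_commit(disk, generation, survivors):
    # Same as commit(), but power is lost before the final flush.
    node_addr = f"node@{generation}"
    disk.write(node_addr, generation)
    disk.flush()
    disk.write("super", (node_addr, generation))
    disk.crash(survivors)


def superblock_is_consistent(disk):
    node_addr, wanted = disk.persistent["super"]
    return disk.persistent.get(node_addr) == wanted


def run(honors_flush, survivors):
    disk = ToyDisk(honors_flush)
    commit(disk, 1)
    disk.settle()
    interrupted_commit(disk, 2, survivors)
    return superblock_is_consistent(disk)


print(run(True, ("super",)))   # True: node@2 was flushed before the superblock
print(run(False, ("super",)))  # False: superblock survived, node@2 did not
```

In the second case the surviving superblock references a generation whose tree block never reached the disk, which is the toy analogue of a "parent transid verify failed" error.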
