Re: csum failed, bad tree, block, IO failures. Is my drive dead or has my BTRFS broke itself?
From: David Sterba <hidden>
Date: 2021-10-18 10:09:17
On Sun, Oct 17, 2021 at 08:00:59AM +0800, Qu Wenruo wrote:
On 2021/10/17 04:45, James Harvey wrote:
Check hasn't finished yet, but it's spit out about 1700 messages (tmux won't let me scroll up further) that all look like this:
Yeah, this means quite a lot of metadata is filled with garbage. I'm not sure why, but it doesn't look like it was caused by btrfs itself.
Agreed, if btrfs itself had caused this, that amount of garbage would have been detected by other means (mismatching csums while the system was still in use, or by the pre-write/post-read tree checker). It's not bitflips; there are too many changes, e.g. in the bogus block offsets. Analyzing the actual data left on disk for some known pattern could at least give a hint about what it was, e.g. strings, file headers or raw pointers.

Besides that, a manual system check could help prevent this in the future: check cables, possible overheating, and that the kernel and firmware are up to date (in case it was caused by other subsystems).
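
To illustrate the kind of pattern analysis mentioned above, here is a minimal Python sketch that looks at one damaged block which has already been copied out to a file (for example with dd). It is not a btrfs tool and none of the names in it come from btrfs-progs; the signature table, the function names and the "kernel pointer" heuristic (little-endian 64-bit words whose top 16 bits are 0xffff, as on x86-64) are all assumptions made for illustration.

```python
import re
import struct

# A few well-known file signatures. Illustrative only, far from exhaustive.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"PK\x03\x04": "ZIP archive",
    b"\x7fELF": "ELF binary",
    b"%PDF-": "PDF document",
}


def printable_strings(block, min_len=6):
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}"
    return re.findall(pattern, block)


def find_magic(block):
    """Return sorted (offset, description) pairs for each signature hit."""
    hits = []
    for sig, name in MAGIC.items():
        pos = block.find(sig)
        while pos != -1:
            hits.append((pos, name))
            pos = block.find(sig, pos + 1)
    return sorted(hits)


def count_kernel_pointers(block):
    """Count aligned 64-bit words that look like x86-64 kernel addresses.

    Heuristic (an assumption, not a rule): top 16 bits are all ones,
    and the word is not simply all ones.
    """
    usable = len(block) - len(block) % 8
    count = 0
    for (value,) in struct.iter_unpack("<Q", block[:usable]):
        if value >> 48 == 0xFFFF and value != 0xFFFFFFFFFFFFFFFF:
            count += 1
    return count


def describe_block(block):
    """Summarize what the garbage in a block resembles."""
    return {
        "strings": printable_strings(block),
        "magic": find_magic(block),
        "kernel_pointers": count_kernel_pointers(block),
    }
```

Usage would be along the lines of `describe_block(open("bad-block.bin", "rb").read())`, where bad-block.bin is a hypothetical file name. Many path-like strings or file signatures would suggest file data landed where metadata should be, while many pointer-like words would point towards stray memory contents. Either result is only a hint, not a diagnosis.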