Re: On the issue of direct I/O and csum warnings
From: Martin Raiber <hidden>
Date: 2021-07-23 18:45:44
On 23.07.2021 16:55 Jonas Aaberg wrote:
Hi, I use btrfs on dm-crypt. About two months ago, I started to get: -- BTRFS warning (device dm-0): csum failed root 257 ino 1068852 off 25690112 csum 0xa27faf9a expected csum 0x4c266278 mirror 1 BTRFS error (device dm-0): bdev /dev/mapper/disk0 errs: wr 0, rd 0, flush 0, corrupt 349, gen 0 -- kind of warnings/errors on my laptop. I went and bought a new NVMe disk because I'm rather fond of my data, even though most of it is backed up. A week later, I started to get the same kind of warning/error message on my new NVMe. After half a day of memtest86 found no memory errors, I gave up on my otherwise stable laptop and started to use an old laptop that I've been too lazy to sell, while looking out for a decent pre-owned newer laptop. Now I'm just about to install and move over to a newly bought laptop, when today my old laptop started to show the same warnings/errors. My old laptop does not share a single part with the laptop on which I previously got the "checksum failure" warnings. Therefore I have a hard time believing that I've gotten the same hardware failure twice. Then I found: <https://btrfs.wiki.kernel.org/index.php/Gotchas> and "Direct I/O and CRCs", which I believe is what I've run into. One of the affected files is a log file from syncthing on both computers.
I wouldn't be certain about the conclusion that it is the direct I/O csum issue. Are you sure syncthing is writing its logs via direct I/O? That would be bad, e.g. because it disables btrfs compression and log files compress really well. So I'd say report additional information like the kernel version (and whether it is a vanilla kernel), how your btrfs is set up (e.g. metadata RAID1), etc.
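One rough way to check whether a process really has files open with direct I/O is to look at the "flags:" line in /proc/<pid>/fdinfo/<fd>, which the kernel prints in octal. The following is only a sketch: the function names are made up for illustration, the O_DIRECT value differs between architectures (the fallback below assumes x86), and it only sees files that are open at the moment it runs.

```python
import os

# O_DIRECT is architecture-dependent; fall back to the x86 value if this
# platform's os module does not define it (assumption for illustration).
O_DIRECT = getattr(os, "O_DIRECT", 0o40000)


def fdinfo_uses_direct_io(fdinfo_text, direct_flag=O_DIRECT):
    """Return True if the 'flags:' line of an fdinfo dump has O_DIRECT set."""
    for line in fdinfo_text.splitlines():
        if line.startswith("flags:"):
            flags = int(line.split()[1], 8)  # the kernel prints flags in octal
            return bool(flags & direct_flag)
    return False


def direct_io_files(pid):
    """Yield paths that process `pid` currently has open with O_DIRECT."""
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            with open(f"/proc/{pid}/fdinfo/{fd}") as f:
                text = f.read()
            if fdinfo_uses_direct_io(text):
                yield os.readlink(f"{fd_dir}/{fd}")
        except OSError:
            continue  # the fd was closed while we were looking
```

Running `list(direct_io_files(pid))` against the syncthing PID (as root or as the owning user) should show whether its log file is among the O_DIRECT files, if there are any.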
I have just one humble request, please do something about this checksum error message. Just add printk with a link to: <https://btrfs.wiki.kernel.org/index.php/Gotchas> and the issue of "Direct I/O and CRCs".
The problem is that nothing can be done without impacting performance, and direct I/O is used precisely for performance. IMO it should be disabled by default (i.e. btrfs would just pretend to do direct I/O, like ZFSOnLinux does) and only be enabled via a mount option.
Maybe updating the wiki with: `find <mountpoint> -inum <ino-number-from-warning-message>` would be helpful as well.
`btrfs inspect-internal inode-resolve <ino-number-from-warning-message> <fs>` is faster.
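
To avoid copying the inode number by hand, the warning line can be parsed and turned into that command. This is a minimal sketch: the helper names are hypothetical, the regular expression only assumes the message format quoted at the top of this thread, and the path /mnt/data in the usage note is a placeholder. As far as I know, the path passed to inode-resolve has to lie inside the subvolume named by "root" in the warning (257 here), so treat that as something to verify.

```python
import re
import shlex

# Matches the fields of a "csum failed" warning as quoted in this thread.
CSUM_RE = re.compile(
    r"csum failed root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)


def parse_csum_warning(line):
    """Return root, ino and off as ints, or None if the line does not match."""
    m = CSUM_RE.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


def inode_resolve_command(ino, path):
    """Build the inode-resolve command line for a path inside the subvolume."""
    return f"btrfs inspect-internal inode-resolve {ino} {shlex.quote(path)}"
```

For example, feeding a line from dmesg through `parse_csum_warning` and then calling `inode_resolve_command(info["ino"], "/mnt/data")` gives a command that can be pasted into a root shell.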