Thread: 12 messages, 4 authors, 2012-08-27

Re: kernel BUG at fs/btrfs/extent-tree.c:5038 (linux 3.4.7)

From: Stefan Behrens <hidden>
Date: 2012-08-08 16:06:54

On Wed, 8 Aug 2012 16:45:57 +0200, David Sterba wrote:
> On Sun, Aug 05, 2012 at 04:11:47PM +0200, Olivier Bonvalet wrote:
> > Aug  5 16:10:12 backup2 kernel: [   58.674758] parent transid verify failed on 615015833600 wanted 110423 found 110424
1st mirror fails verify_parent_transid().
> > Aug  5 16:10:12 backup2 kernel: [   58.675090] parent transid verify failed on 615015833600 wanted 110423 found 110424
2nd mirror fails verify_parent_transid().
> > Aug  5 16:10:12 backup2 kernel: [   58.675523] btrfs read error corrected: ino 1 off 615015833600 (dev /dev/mapper/vg--backupplug-backup sector 1209083504)
That's a bug. It is wrong to ignore the previous results from
verify_parent_transid() and to call repair_eb_io_failure() which
rewrites one mirror and claims to have corrected an error. But it's not
a major issue, just a misleading message in the kernel log and a disk
write operation which does not repair anything.
> This looks strange, the corrupted block belongs to metadata, I
> assume you have the DUP profile, so there is a good copy that can be
> used instead, the error message confirms that, but ...
> > Aug  5 16:10:12 backup2 kernel: [   58.675536] Failed to read block groups: -5
That's correct, because the UPTODATE flag in the extent is not set
(verify_parent_transid() clears it when it detects an error).
> ... ? -5 means EIO, which is returned when a block cannot be read, so
> unless there's a different reason for it, this looks like a missed
> opportunity to fix an error and continue.
>
> The same error messages are present in the logs from 3.4 version.
> > Aug  5 16:10:12 backup2 kernel: [   58.704720] btrfs: open_ctree failed
The summary is that the block was not correctable, both mirrors had the
same old transid. The bug is that the call to repair_io_failure() should
not have been done because verify_parent_transid() indicated errors.

I'll prepare a patch for it. Changing btree_read_extent_buffer_pages()
to set ret to -EIO if verify_parent_transid() fails should fix the issue.
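
To illustrate the intended behaviour, here is a minimal user-space sketch. It
is not the kernel patch: struct mirror, verify_transid() and read_block() are
hypothetical names invented for this example, and the real code works on
extent buffers rather than on a plain array. The point is only that the read
returns -EIO when no mirror passes the transid check, and that a stale copy is
rewritten only when a verified copy exists to repair it from.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one on-disk copy (mirror) of a metadata block. */
struct mirror {
    uint64_t transid;   /* generation found in the block header */
    int rewritten;      /* number of times a "repair" rewrote this copy */
};

/* Returns 0 if the copy carries the wanted generation, nonzero otherwise. */
int verify_transid(const struct mirror *m, uint64_t wanted)
{
    return m->transid == wanted ? 0 : 1;
}

/*
 * Try each mirror in turn. On success, returns 0 and stores the index of the
 * verified copy in *good. If every mirror fails verification, returns -EIO
 * and rewrites nothing, so no misleading "corrected" event can occur.
 */
int read_block(struct mirror *mirrors, size_t n, uint64_t wanted, size_t *good)
{
    int ret = -EIO;     /* assume failure until a mirror verifies */

    for (size_t i = 0; i < n; i++) {
        if (verify_transid(&mirrors[i], wanted) == 0) {
            *good = i;
            ret = 0;
            break;
        }
    }
    if (ret)
        return ret;     /* no good copy: nothing to repair from */

    /* Simplification: rewrite every mirror that failed before the good one. */
    for (size_t j = 0; j < *good; j++) {
        mirrors[j].transid = mirrors[*good].transid;
        mirrors[j].rewritten++;
    }
    return 0;
}
```

With two mirrors that both report 110424 while 110423 is wanted, as in the log
above, read_block() returns -EIO and leaves both copies untouched.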