Thread (17 messages) 17 messages, 8 authors, 2012-10-29

Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)

From: Jannis Achstetter <hidden>
Date: 2012-10-24 19:25:04
Also in: lkml

Possibly related (same subject, not in this thread)

Am 24.10.2012 00:19, schrieb Theodore Ts'o:
[...]
The reason why the problem happens rarely is that the effect of the
buggy commit is that if the journal's starting block is zero, we fail
to truncate the journal when we unmount the file system.  This can
happen if we mount and then unmount the file system fairly quickly,
before the log has a chance to wrap.  After the first time this has
happened, it's not a disaster, since when we replay the journal, we'll
just replay some extra transactions.  But if this happens twice, the
oldest valid transaction will still not have gotten updated, but some
of the newer transactions from the last mount session will have gotten
written by the very latest transacitons, and when we then try to do
the extra transaction replays, the metadata blocks can end up getting
very scrambled indeed.
[...]
As a "normal linux user" I'm interested in the practical things to do
now to avoid data loss. I'm running several systems with 3.6.2 and ext4.
Fearing loss of data:
- Is there a way to see whether the journal of a specific partition has
been wrapped (since mounting) so that umounting and mounting (or doing a
reboot to downgrade the kernel) is safe?
- Is there a way to "force" a journal-wrap? Run any
filesystem-benchmark? Which one with what parameters? Or is it unwise
since I might even further corrupt data if I hit the case already?
- Is it wise to umount now and run e2fsck or might I corrupt my files
just by umounting now if the journal hasn't wrapped yet?
- How do you define "fairly quickly"? Of course servers run 24/7 but I
might be using my PC 2-5 hrs a day... Is that a "reboot to soon after
booting"?
- Any more advice you can give to the ordinary user to avoid
fs-corruption? Don't shut down machines for some days? Better down- or
upgrade the kernel?

Best regards,
	Jannis Achstetter

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help