Thread: 25 messages, 7 authors, 2018-12-13

Re: ext4 file system corruption with v4.19.3 / v4.19.4

From: Andrey Melnikov <hidden>
Date: 2018-12-05 12:58:24
Also in: lkml

On Mon, 3 Dec 2018 at 01:11, Rainer Fiebig [off-list ref] wrote:
On 02.12.18 at 21:19, Andrey Melnikov wrote:
quoted
On Thu, 29 Nov 2018 at 01:08, Rainer Fiebig [off-list ref] wrote:
quoted
On 28.11.18 at 22:13, Andrey Melnikov wrote:
quoted
On Wed, 28 Nov 2018 at 18:55, Rainer Fiebig [off-list ref] wrote:
quoted
On Wednesday, 28 November 2018, 13:02:56, Andrey Jr. Melnikov wrote:
quoted
In gmane.comp.file-systems.ext4 Theodore Y. Ts'o [off-list ref] wrote:
quoted
On Wed, Nov 28, 2018 at 03:16:33AM +0300, Andrey Jr. Melnikov wrote:
quoted
The corrupted inodes are always directories that have not been written
to for at least a year. Is something going wrong when updating atime?
We're not sure.  The frustrating thing is that it's not reproducing
for me.  I run extensive regression tests, and I'm using 4.19 on my
development laptop without noticing any problems.  If I could reproduce
it, I could debug it, but since I can't, I need to rely on those who
are seeing the problem to help pinpoint the problem.
My workstation hits this bug every time after boot. If you have an idea,
I can test it.
quoted
I'm trying to figure out common factors from those people who are
reporting problems.

(a) What distribution are you running (it appears that many people
reporting problems are running Ubuntu, but this may be a sampling
issue; lots of people run Ubuntu)?  (For the record, I'm using Debian
Testing.)
Debian sid, but with a self-built kernel from the Ubuntu mainline PPA.
You could try a vanilla 4.19.5 from https://www.kernel.org/
and compile it with your current .config.
The mainline PPA uses the vanilla kernel. Its patches only add
Debian-specific build infrastructure.
quoted
If you still see the errors, at least the Ubuntu-kernel could be ruled out.

In addition, if you still see the errors:

- backup your .config in a *different* folder (so that you can later re-use
it)
- do a "make mrproper" (deletes the .config, see above)
- do a "make defconfig"
- and compile the kernel with that new .config
defconfig is great, for abstract hardware in a vacuum.
quoted
If you still have the problem after that, you may want to learn how to bisect.
;)
I already know how to bisect, from the kernel 2.0 era. Without git ;)

This problem is simply not bisectable when the same kernel corrupts the
FS on my workstation but works normally on other servers.
And now the FS is corrupted again, with CONFIG_EXT4_ENCRYPTION disabled. Great.
OK, and now we are looking forward to *your* ideas on how to solve this.
After four days of playing games with git bisect, the real winner is
Debian gcc-8.2.0-9. Upgrading it to 8.2.0-10, or using version 7.3.0-30
with the same kernel + config, does not exhibit ext4 corruption.

I think I hit this https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87859
with 8.2.0-9 version.
Good that it works for you. But others used gcc 5.4.0 or 6.3.0 and were
hit anyway: https://bugzilla.kernel.org/show_bug.cgi?id=201685#c165
It depends on the workload pattern. 4.19.5 built with 8.2.0-10 and with
7.3.0-30 crashed after 4 hours of usage (the previous build crashed in
5 minutes). So my assumption about a broken gcc was wrong.
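
For anyone comparing compilers across affected and unaffected machines, here is a
minimal sketch of how to check which gcc built the running kernel. It assumes
the /proc/version layout used by 4.19-era kernels ("... (gcc version X.Y.Z
(Distro ...)) ..."); newer kernels format this string differently. The
function name kernel_compiler is hypothetical and was not used in this thread.

```python
import re
from typing import Optional, Tuple


def kernel_compiler(proc_version: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (gcc_version, distro_tag) parsed from /proc/version text.

    Assumes the older "gcc version X.Y.Z (Distro tag)" layout; returns None
    when no such substring is present. distro_tag is None if it is missing.
    """
    match = re.search(r"gcc version ([\d.]+)(?: \(([^)]*)\))?", proc_version)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Read the version string of the currently running kernel (Linux only).
    with open("/proc/version", encoding="utf-8") as handle:
        print(kernel_compiler(handle.read()))
```

Collecting this value next to the kernel version from each reporter would show
whether the corruption really tracks one compiler build or, as it turned out
here, does not.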