Re: ext4 file system corruption with v4.19.3 / v4.19.4
From: Andrey Melnikov <hidden>
Date: 2018-12-05 12:58:24
Also in: lkml
On Mon, 3 Dec 2018 at 01:11, Rainer Fiebig [off-list ref] wrote:
> On 02.12.18 at 21:19, Andrey Melnikov wrote:
>> On Thu, 29 Nov 2018 at 01:08, Rainer Fiebig [off-list ref] wrote:
>>> On 28.11.18 at 22:13, Andrey Melnikov wrote:
>>>> On Wed, 28 Nov 2018 at 18:55, Rainer Fiebig [off-list ref] wrote:
>>>>> On Wednesday, 28 November 2018, 13:02:56, Andrey Jr. Melnikov wrote:
>>>>>> In gmane.comp.file-systems.ext4 Theodore Y. Ts'o [off-list ref] wrote:
>>>>>>> On Wed, Nov 28, 2018 at 03:16:33AM +0300, Andrey Jr. Melnikov wrote:
>>>>>>>> The corrupted inodes are always directories that have not been written to for at least a year. Is something going wrong when atime is updated?
>>>>>>> We're not sure. The frustrating thing is that it's not reproducing for me. I run extensive regression tests, and I'm using 4.19 on my development laptop without noticing any problems. If I could reproduce it, I could debug it, but since I can't, I need to rely on those who are seeing the problem to help pinpoint the problem.
>>>>>> My workstation hits this bug every time after boot. If you have an idea, I can test it.
>>>>>>> I'm trying to figure out common factors from those people who are reporting problems. (a) What distribution are you running (it appears that many people reporting problems are running Ubuntu, but this may be a sampling issue; lots of people run Ubuntu)? (For the record, I'm using Debian Testing.)
>>>>>> Debian sid, but with a self-built kernel from the Ubuntu mainline-ppa.
>>>>> You could try a vanilla 4.19.5 from https://www.kernel.org/ and compile it with your current .config.
>>>> The mainline-ppa uses the vanilla kernel. Its patches only add Debian-specific build infrastructure.
>>>>> If you still see the errors, at least the Ubuntu kernel could be ruled out. In addition, if you still see the errors:
>>>>> - back up your .config in a *different* folder (so that you can later re-use it)
>>>>> - do a "make mrproper" (deletes the .config, see above)
>>>>> - do a "make defconfig"
>>>>> - and compile the kernel with that new .config
>>>> defconfig is great for abstract hardware in a vacuum.
>>>>> If you still have the problem after that, you may want to learn how to bisect. ;)
>>>> I already know how to bisect. Since the kernel 2.0 era. Without git ;) This problem is simply not bisectable when the same kernel corrupts the FS on my workstation but works normally on other servers. And now the FS is corrupted again with CONFIG_EXT4_ENCRYPTION disabled. Great.
>>> OK, and now we are looking forward to *your* ideas how to solve this.
>> After four days of playing games with git bisect, the real winner is Debian gcc-8.2.0-9. Upgrading it to 8.2.0-10, or using version 7.3.0-30, with the same kernel + config does not exhibit ext4 corruption. I think I hit https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87859 with version 8.2.0-9.
> Good that it works for you. But others used gcc 5.4.0 or 6.3.0 and were hit anyway: https://bugzilla.kernel.org/show_bug.cgi?id=201685#c165
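
For readers who want to script the clean-rebuild steps Rainer lists in the quoted text, here is a minimal Python sketch. It is not something anyone in the thread ran. The function name, the tree path `linux-4.19.5`, the backup directory and the backup file name `config.backup` are all hypothetical. By default it only returns the commands (dry run), so nothing is deleted by accident.

```python
import shutil
import subprocess
from pathlib import Path


def clean_defconfig_build(tree, backup_dir, jobs=4, dry_run=True):
    """Return the commands for a clean defconfig build; run them unless dry_run."""
    tree = Path(tree)
    backup_dir = Path(backup_dir)
    commands = [
        ["make", "mrproper"],   # removes build products and the .config
        ["make", "defconfig"],  # writes the default configuration
        ["make", f"-j{jobs}"],  # builds the kernel with that new .config
    ]
    if not dry_run:
        # Save .config outside the tree first, because mrproper deletes it.
        backup_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(tree / ".config", backup_dir / "config.backup")  # hypothetical name
        for cmd in commands:
            subprocess.run(cmd, cwd=tree, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical paths; dry run only prints the plan.
    for cmd in clean_defconfig_build("linux-4.19.5", "/tmp/kconfig-backup"):
        print(" ".join(cmd))
```

Passing `dry_run=False` would actually copy the `.config` and run the three `make` steps inside the tree, stopping at the first failing command.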
It depends on the workload pattern. 4.19.5 built with gcc 8.2.0-10 and with 7.3.0-30 crashed after 4 hours of use (the previous build crashed within 5 minutes). So my assumption about a broken gcc was wrong.
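
When comparing builds like this, it helps to confirm which compiler produced the kernel that is actually running. The sketch below reads that from `/proc/version`. It assumes the `gcc version X.Y.Z (distro string)` layout that kernels of the 4.19 era print; newer kernels format this field differently, and the function then returns `None`. The function name and the sample string are illustrative, not taken from the thread.

```python
import re
from pathlib import Path

# Matches e.g. "gcc version 8.2.0 (Debian 8.2.0-9)"; the distro part is optional.
GCC_RE = re.compile(r"gcc version (\d+(?:\.\d+)*)(?: \(([^)]*)\))?")


def compiler_of(version_text):
    """Return (gcc_version, distro_string_or_None), or None if no match."""
    match = GCC_RE.search(version_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Illustrative sample in the assumed 4.19-era layout.
    sample = ("Linux version 4.19.5 (user@host) "
              "(gcc version 8.2.0 (Debian 8.2.0-9)) #1 SMP")
    print(compiler_of(sample))

    proc_version = Path("/proc/version")
    if proc_version.exists():
        print(compiler_of(proc_version.read_text()))
```

Recording this pair next to each test run makes it easier to tell later whether a "good" or "bad" result really came from the compiler version that was intended.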