Thread: 2 messages, 2 authors, 2017-03-13

Re: v4.7--v4.10+: ext4: repeatable inline-data oops (and fs corruption) caused by msync() of shared writable mmap (with recipe)

From: Darrick J. Wong <hidden>
Date: 2017-03-13 23:37:05
Also in: linux-fsdevel

On Mon, Mar 13, 2017 at 11:11:35PM +0000, Nick Alcock wrote:
On 13 Mar 2017, Eric Biggers spake thusly:
On Wed, Mar 01, 2017 at 11:45:52AM +0000, Nick Alcock wrote:
[Resend, after the first attempt, from my home address, failed with
 endless greylisting followed by "4.5.0 Interactive router timed out"
 from all but the lowest-priority MX for vger, and "Name server:
 bl-ckh-le.kernel.org.: host not found" for the apparently-nonexistent
 lowest-priority MX. Maybe it'll work better from here.]

I first spotted this -- or it spotted me -- back in the v4.7.x days. It
is still present in v4.10.

Here's a replication recipe, given a reasonable rootfs with a compiler
on it, and assuming a blank virtio disk on /dev/vdb:
Hi Nick, thanks for reporting this.  I've sent a patch which should fix this,
and Cc'ed you.  This actually seems to have been a bug for a very long time, maybe
I'll test it. Your timing is supernatural: I was just about to mkfs all
the filesystems on my new server (a once-in-a-decade operation for me)
and was bemoaning the fact that I couldn't turn on inline_data at the
same time. Now I can! (I have good backups so can take suicidally crazy
risks).
Glad to hear you have backups!

I wouldn't turn on inline_data for files, period.  It's not as well tested
as it ought to be (clearly). :/

--D
quoted
even ever since the inline_data feature was introduced.  (I was able to
reproduce it in a 3.18 kernel, at least.)  I'm not sure why it didn't get
noticed earlier --- maybe hardly anyone ever writes to small files with mmap...
Yeah, I built my /usr/src with it and ran for weeks without hitting it:
it wasn't until I rebuilt most of a distro and hit dovecot that anything
went wrong.

I note that what I saw then was massive filesystem corruption, so
massive that not even tune2fs recognized it as being an ext4 fs
afterwards. Perhaps the thing wrote badness into the journal (possibly
including inline data scribbled over the next inode?) and replayed it
over the fs on the next boot, following which a cascade of increasing
badness ended up eating the entire fs... ah well, I guess it's hard to
know now, months after the fact (though if it's of interest, I still
have an e2image of the corrupted fs lying around!)

-- 
NULL && (void)