Thread: 29 messages, 8 authors, 2012-09-17

Re: NULL pointer dereference in ext4_ext_remove_space on 3.5.1

From: Fengguang Wu <hidden>
Date: 2012-08-17 06:01:16
Also in: lkml

On Thu, Aug 16, 2012 at 11:25:13AM -0400, Theodore Ts'o wrote:
> On Thu, Aug 16, 2012 at 07:10:51PM +0800, Fengguang Wu wrote:
> > Here is the dmesg. BTW, it seems 3.5.0 don't have this issue.
>
> Fengguang,
>
> It sounds like you have a (at least fairly) reliable reproduction for
> this problem?  Is it something you can share?  It would be good to get
Right, it can be easily reproduced here. I'm running these writeback
performance tests:

        https://github.com/fengguang/writeback-tests

Which is basically doing N parallel dd writes to JBOD/RAID arrays on
various filesystems. It seems that the RAID test can reliably trigger
the problem.
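
For anyone who wants the general shape of the workload without pulling
the whole test suite, here is a rough Python sketch of N concurrent
sequential writers. It is not the writeback-tests scripts themselves:
the function names, file names, and sizes are made up for illustration,
it does not set up the RAID array or the filesystem, and there is no
claim that it triggers the oops on its own.

```python
import os
import tempfile
import threading


def write_file(path, block_size, block_count):
    """Write block_count blocks of zeros to path, roughly like `dd if=/dev/zero`."""
    block = bytes(block_size)
    with open(path, "wb") as f:
        for _ in range(block_count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the dirty pages out to the device


def parallel_writes(target_dir, writers, block_size=1 << 20, block_count=64):
    """Run `writers` concurrent sequential writes; return the file paths."""
    # Hypothetical file naming; one output file per writer.
    paths = [os.path.join(target_dir, f"dd-{i}.dat") for i in range(writers)]
    threads = [
        threading.Thread(target=write_file, args=(p, block_size, block_count))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths


if __name__ == "__main__":
    # Assumption: in a real run target_dir would be a mount point on the
    # array under test; a temporary directory keeps the sketch harmless.
    with tempfile.TemporaryDirectory() as d:
        for p in parallel_writes(d, writers=4, block_count=8):
            print(p, os.path.getsize(p))
```

Threads are enough here because the writers spend their time blocked in
write and fsync calls; separate dd processes, as in the real tests,
would behave similarly from the filesystem's point of view.
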
> this into our test suites, since it was _not_ something that was
> caught by xfstests, apparently.
>
> Can you see if this patch addresses it?  (The first two patch hunks
> are the same debugging additions I had posted before.)
>
> It looks like the responsible commit is 968dee7722: "ext4: fix hole
> punch failure when depth is greater than 0".  I had thought this patch
> was low risk if you weren't using the new punch ioctl, but it turns
> out it did make a critical change in the non-punch (i.e., truncate)
> code path, which is what the addition of "i = 0;" in the patch below
> addresses.

Yes, I'm sure the patch fixed the bug. With the fix, the writeback
tests have run flawlessly for a dozen hours without any problem.

Thanks,
Fengguang