Re: NULL pointer dereference in ext4_ext_remove_space on 3.5.1
From: Fengguang Wu <hidden>
Date: 2012-08-17 06:01:16
Also in:
lkml
On Thu, Aug 16, 2012 at 11:25:13AM -0400, Theodore Ts'o wrote:
> On Thu, Aug 16, 2012 at 07:10:51PM +0800, Fengguang Wu wrote:
> > Here is the dmesg. BTW, it seems 3.5.0 don't have this issue.
>
> Fengguang,
>
> It sounds like you have a (at least fairly) reliable reproduction for
> this problem? Is it something you can share? It would be good to get

Right, it can be easily reproduced here. I'm running these writeback
performance tests:
https://github.com/fengguang/writeback-tests
They basically run N parallel dd writes to JBOD/RAID arrays on
various filesystems. The RAID test seems to trigger the problem
reliably.
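
For illustration only (this is not the actual writeback-tests script),
the shape of the workload is roughly the sketch below. The variable
names, file names and sizes are made up for the example, and it writes
to a temporary directory by default rather than to a real JBOD/RAID
mount.

```shell
#!/bin/sh
# Hypothetical sketch of "N parallel dd writes"; not taken from writeback-tests.
# TARGET_DIR, N_WRITERS, BLOCK_SIZE and BLOCK_COUNT are illustrative names.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"   # point this at the filesystem under test
N_WRITERS="${N_WRITERS:-4}"                # number of concurrent dd processes
BLOCK_SIZE="${BLOCK_SIZE:-1048576}"        # bytes per block (1 MiB)
BLOCK_COUNT="${BLOCK_COUNT:-4}"            # blocks written by each dd

i=1
while [ "$i" -le "$N_WRITERS" ]; do
    # Each writer streams zeros into its own file, in the background.
    dd if=/dev/zero of="$TARGET_DIR/dd-$i" \
       bs="$BLOCK_SIZE" count="$BLOCK_COUNT" 2>/dev/null &
    i=$((i + 1))
done

# Block until every background writer has finished.
wait
```

To exercise a real array, TARGET_DIR would be set to a directory on the
mounted filesystem and the sizes raised well beyond these toy values.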

> this into our test suites, since it was _not_ something that was caught
> by xfstests, apparently.
>
> Can you see if this patch addresses it? (The first two patch hunks are
> the same debugging additions I had posted before.)
>
> It looks like the responsible commit is 968dee7722: "ext4: fix hole
> punch failure when depth is greater than 0". I had thought this patch
> was low risk if you weren't using the new punch ioctl, but it turns out
> it did make a critical change in the non-punch (i.e., truncate) code
> path, which is what the addition of "i = 0;" in the patch below
> addresses.

Yes, I'm sure the patch fixed the bug. With the fix, the writeback tests
have run flawlessly for a dozen hours without any problem.

Thanks,
Fengguang