Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert... | linux-ext4

[PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 02/17] fs: new infrastructure for writeback error handling and reporting · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 03/17] mm: tracepoints for writeback error events · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 05/17] Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 06/17] fs: adapt sync_file_range to new reporting infrastructure · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 08/17] dax: set errors in mapping when writeback fails · Jeff Layton <hidden> · 2017-05-31
Re: [PATCH v5 08/17] dax: set errors in mapping when writeback fails · Ross Zwisler <hidden> · 2017-06-06
Re: [PATCH v5 08/17] dax: set errors in mapping when writeback fails · Jeff Layton <hidden> · 2017-06-06
[PATCH v5 09/17] block: convert to errseq_t based writeback error tracking · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 10/17] block: add sync_blockdev_since and sync_filesystem_since · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 12/17] fs: allow __generic_file_fsync to support both flavors of error reporting · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 15/17] fs: add a write_one_page_since · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 16/17] ext2: convert to errseq_t based writeback error tracking · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 17/17] fs: convert ext2 to use write_one_page_since · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 01/17] lib: add errseq_t type and infrastructure for handling it · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 04/17] fs: add a new fstype flag to indicate how writeback errors are tracked · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 14/17] ext4: convert to errseq_t based error tracking · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 11/17] fs: add f_md_wb_err field to struct file for tracking metadata errors · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 13/17] jbd2: conditionally handle errors using errseq_t based on FS_WB_ERRSEQ flag · Jeff Layton <hidden> · 2017-05-31
[PATCH v5 07/17] mm: add filemap_fdatawait_range_since and filemap_write_and_wait_range_since · Jeff Layton <hidden> · 2017-05-31
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Andrew Morton <akpm@linux-foundation.org> · 2017-05-31
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Jeff Layton <hidden> · 2017-05-31
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Andrew Morton <akpm@linux-foundation.org> · 2017-05-31
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Jeff Layton <hidden> · 2017-05-31
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Ross Zwisler <hidden> · 2017-06-02
Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it · Jeff Layton <hidden> · 2017-06-02

Re: [PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it

From: Jeff Layton <hidden>
Date: 2017-05-31 22:01:10
Also in: linux-block, linux-fsdevel, lkml

On Wed, 2017-05-31 at 14:37 -0700, Andrew Morton wrote:

On Wed, 31 May 2017 17:31:49 -0400 Jeff Layton [off-list ref] wrote:

On Wed, 2017-05-31 at 13:27 -0700, Andrew Morton wrote:

On Wed, 31 May 2017 08:45:23 -0400 Jeff Layton [off-list ref] wrote:

This is v5 of the patchset to improve how we're tracking and reporting
errors that occur during pagecache writeback.

I'm curious to know how you've been testing this? Is that testing
strong enough for us to be confident that all nature of I/O errors
will be reported to userspace?

That's a tall order. This is a difficult thing to test as these sorts of
errors are pretty rare by nature.

I have an xfstest that I posted just after this set that demonstrates
that it works correctly, at least on ext2/3/4 when run by the ext4
driver (ext2 legacy driver reports too many errors currently). I had
btrfs and xfs working on that test too in an earlier incarnation of this
set, so I think we can fix this in them as well without too much
difficulty.

I'm happy to run other tests if someone wants to suggest them.

Now, all that said, I don't think this will make things any worse than
they are today as far as reporting errors properly to userland goes.
It's rather easy for an incidental synchronous writeback request from an
internal caller to clear the AS_* flags today. This will at least ensure
that we're reporting errors since a well-defined point in time when you
call fsync.
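
To make the contrast above concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the class and method names (FlagMapping, SeqMapping, SeqFile, record_error, test_and_clear) are hypothetical and are not the kernel's errseq_t API, and the real type packs its state differently. It only illustrates why a shared test-and-clear flag can lose an error to an internal caller, while a per-file cursor reports errors since a well-defined point.

```python
import errno
from dataclasses import dataclass


@dataclass
class FlagMapping:
    """Flag style (toy): one shared error, cleared by whoever tests it first."""
    error: int = 0

    def record_error(self, err: int) -> None:
        self.error = err

    def test_and_clear(self) -> int:
        err, self.error = self.error, 0
        return err


@dataclass
class SeqMapping:
    """Counter style (toy): the latest error plus a counter bumped per error."""
    seq: int = 0
    error: int = 0

    def record_error(self, err: int) -> None:
        self.seq += 1
        self.error = err


class SeqFile:
    """An open file that remembers the counter value it last observed."""

    def __init__(self, mapping: SeqMapping) -> None:
        self.mapping = mapping
        self.seen = mapping.seq  # sampled at open time

    def fsync(self) -> int:
        if self.seen == self.mapping.seq:
            return 0  # nothing new since this file last looked
        self.seen = self.mapping.seq  # advance only this file's cursor
        return self.mapping.error


# Flag model: an internal sync consumes the error before userspace looks.
flags = FlagMapping()
flags.record_error(errno.EIO)
internal_result = flags.test_and_clear()
user_result = flags.test_and_clear()

# Counter model: every open file reports the error once, then goes quiet.
mapping = SeqMapping()
files = [SeqFile(mapping) for _ in range(3)]
mapping.record_error(errno.EIO)
first_pass = [f.fsync() for f in files]
second_pass = [f.fsync() for f in files]
```

In the flag model the internal caller sees the error and the later userspace check sees nothing. In the counter model each of the three open files sees the error once, and a repeat fsync on each returns success.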

Were you using error injection of some form?  If so, how was that all
set up?

Yes, it uses dm-error for fault injection.

The test basically does:

1) set up a dm-error device in a working configuration

2) build a scratch filesystem on it, with the log on a different device
in some fashion so metadata writeback will still succeed.

3) open the same file several times

4) flip dm-error device to non-working mode

5) write to each fd

6) fsync each fd

...do you get back an error on each fsync?
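
Steps 3, 5 and 6 can be sketched in userspace as below. This is an illustration, not the xfstest itself: the helper name fsync_each_fd and its parameters are made up for the example, and setting up the dm-error device and flipping it to non-working mode (steps 1, 2 and 4) is assumed to happen outside the script.

```python
import os
import tempfile


def fsync_each_fd(path, n_fds=3, rounds=2):
    """Open `path` n_fds times, write via each fd, then fsync each fd.

    Returns one list per round, holding 0 for a successful fsync or the
    errno value of the failure.
    """
    fds = [os.open(path, os.O_RDWR | os.O_CREAT, 0o644) for _ in range(n_fds)]
    try:
        # Step 5: dirty some pagecache through every descriptor.
        for i, fd in enumerate(fds):
            os.pwrite(fd, b"x" * 4096, i * 4096)
        # Step 6, repeated: record what each fsync reports.
        results = []
        for _ in range(rounds):
            row = []
            for fd in fds:
                try:
                    os.fsync(fd)
                    row.append(0)
                except OSError as exc:
                    row.append(exc.errno)
            results.append(row)
        return results
    finally:
        for fd in fds:
            os.close(fd)


if __name__ == "__main__":
    # On a healthy filesystem every entry should be 0.
    with tempfile.TemporaryDirectory() as tmp:
        print(fsync_each_fd(os.path.join(tmp, "testfile")))
```

With the file on a failing dm-error device, the behaviour the test looks for is an error in every entry of the first round and 0 in every entry of the second.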

It then does a bit more to make sure they're cleared afterward as you'd
expect. That works for most block device based filesystems. I also have
a second xfstest that opens a block device and does the same basic
thing. That also works correctly with this patch series.

I still need to come up with a way to simulate errors on other
filesystems, though. We may need to plumb in some kernel-level fault
injection on some of them to do that correctly. Suggestions welcome there.

With this series though, the idea is to convert one filesystem at a
time, so I think that should help mitigate some of the risk.

-- 
Jeff Layton [off-list ref]
