Re: end to end error recovery musings

From: Theodore Tso <tytso@mit.edu>
Date: 2007-02-24 02:32:29
Also in: linux-fsdevel, linux-raid, linux-scsi

On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote:

Probably the only sane thing to do is to remember the bad sectors and
avoid attempting to read them; that would mean marking "automatic"
versus "explicitly requested" requests to determine whether or not to
filter them against a list of discovered bad blocks.

And clearing this list when the sector is overwritten, as it will almost
certainly be relocated at the disk level.  For that matter, a huge win
would be to have the MD RAID layer rewrite only the bad sector (in hopes
of the disk relocating it) instead of failing the whole disk.  Otherwise,
a few read errors on different disks in a RAID set can take the whole
system offline.  Apologies if this is already done in recent kernels...
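
A rough sketch of the bookkeeping described above, purely for
illustration: automatic reads are filtered against the list, explicit
reads always go through, and a write clears the entry.  The class and
method names are made up, and this is user-level Python rather than
anything resembling actual block-layer code.

```python
class BadBlockList:
    """Hypothetical in-memory list of sectors that returned read errors."""

    def __init__(self):
        self._bad = set()

    def record_read_error(self, sector):
        # Remember a sector that failed to read.
        self._bad.add(sector)

    def should_attempt_read(self, sector, explicit):
        # Explicitly requested reads always go to the disk; automatic
        # ones (readahead and the like) are filtered against the list.
        return explicit or sector not in self._bad

    def note_write(self, sector):
        # An overwrite will almost certainly have caused the drive to
        # relocate the sector, so drop it from the list.
        self._bad.discard(sector)
```

With this, a sector that failed once is skipped by automatic reads until
something overwrites it, while an explicit request can still retry it.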

And having a way of making this list available to both the filesystem
and to a userspace utility, so they can more easily deal with doing a
forced rewrite of the bad sector, after determining which file is
involved and perhaps doing something intelligent (up to and including
automatically requesting a backup system to fetch a backup version of
the file, and if it can be determined that the file shouldn't have
been changed since the last backup, automatically fixing up the
corrupted data block :-).
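
The decision a userspace utility might make could look roughly like the
following.  This is only a sketch under assumptions: the function name,
the action strings, and the use of modification time as the "has it
changed since the backup" test are all hypothetical.

```python
def choose_recovery(file_mtime, last_backup_time):
    """Pick a (hypothetical) action for a file hit by a bad sector.

    Both arguments are timestamps in the same units, e.g. seconds
    since the epoch.
    """
    if file_mtime <= last_backup_time:
        # File has not changed since the last backup, so the backup
        # copy of the damaged block can be written back automatically.
        return "rewrite-block-from-backup"
    # File changed after the backup; fetch the backup version but
    # leave the choice of what to do with it to someone else.
    return "fetch-backup-version"
```

Either way, the rewrite of the bad sector would then give the disk a
chance to relocate it.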

						- Ted
