Re: end to end error recovery musings

From: Andreas Dilger <hidden>
Date: 2007-02-24 00:37:23
Also in: linux-fsdevel, linux-raid, linux-scsi

On Feb 23, 2007  16:03 -0800, H. Peter Anvin wrote:

Ric Wheeler wrote:

  (1) read-ahead often means that we will retry every bad sector at
least twice from the file system level. The first time, the fs
read-ahead request triggers a speculative read that includes the bad
sector (triggering the error handling mechanisms), right before the
real application read does the same thing.  Not sure what the answer
is here, since read-ahead is obviously a huge win in the normal case.

Probably the only sane thing to do is to remember the bad sectors and
avoid attempting to read them; that would mean marking requests as
"automatic" versus "explicitly requested" to determine whether or not
to filter them against a list of discovered bad blocks.

And clearing this list when the sector is overwritten, as it will almost
certainly be relocated at the disk level.  For that matter, a huge win
would be to have the MD RAID layer rewrite only the bad sector (in hopes
of the disk relocating it) instead of failing the whole disk.  Otherwise,
a few read errors on different disks in a RAID set can take the whole
system offline.  Apologies if this is already done in recent kernels...
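
To make the bookkeeping concrete, here is a rough sketch of the bad-block
list discussed above: remember sectors that returned read errors, filter
only speculative (read-ahead) requests against that list, and forget a
sector once it has been overwritten. The class and method names are
hypothetical and this is plain Python for illustration only, not how any
kernel layer actually implements it.

```python
class BadBlockList:
    """Hypothetical in-memory record of sectors that returned read errors."""

    def __init__(self):
        self._bad = set()

    def record_read_error(self, sector):
        # Remember a sector that failed a read.
        self._bad.add(sector)

    def should_skip(self, start, count, speculative):
        # Explicitly requested reads are always attempted; only
        # speculative (read-ahead) requests are filtered.
        if not speculative:
            return False
        return any(s in self._bad for s in range(start, start + count))

    def note_write(self, start, count):
        # An overwrite will most likely make the drive relocate the
        # sector, so drop it from the list (assumption from the thread).
        self._bad.difference_update(range(start, start + count))


bbl = BadBlockList()
bbl.record_read_error(1000)
skip_readahead = bbl.should_skip(996, 8, speculative=True)
skip_explicit = bbl.should_skip(996, 8, speculative=False)
bbl.note_write(1000, 1)
skip_after_write = bbl.should_skip(996, 8, speculative=True)
```

With this, a read-ahead covering sector 1000 is skipped, an explicit read
of the same range still goes to the disk, and after the sector is
rewritten read-ahead is allowed again.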

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
