Re: end to end error recovery musings
From: Ric Wheeler <hidden>
Date: 2007-02-26 15:18:22
Also in:
linux-fsdevel, linux-raid, linux-scsi
Alan wrote:
quoted
the new location. I believe this should be always true, so presumably with all modern disk drives a write error should mean something very serious has happened.
Not quite that simple.
I think that write errors are normally quite serious, but there are exceptions that retries might work around. To Ted's point, in general, a write to a bad spot on the media will cause a remapping, which should be transparent (if a bit slow) to us.
If you write a block-aligned size that matches the physical media block size, maybe this is true. If you write a sector on a device whose physical sector size is larger than its logical block size (as allowed by, say, ATA7), then it's less clear what happens. I don't know if the drive firmware implements multiple "tails" in this case.
On a read error it is worth trying the other parts of the I/O.
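To make the alignment case concrete, here is a rough sketch (Python, for brevity) of checking whether a write covers only whole physical sectors. The function name and the 512/4096 byte sizes are assumptions picked for illustration; a real device reports its own sizes.

```python
def covers_whole_physical_sectors(lba, count, logical_size=512, physical_size=4096):
    """Return True if the write [lba, lba + count) touches only complete physical sectors.

    lba and count are in logical sectors. The default sizes are illustrative
    assumptions, not values taken from any particular drive.
    """
    if logical_size <= 0 or physical_size % logical_size != 0:
        raise ValueError("physical size must be a multiple of logical size")
    ratio = physical_size // logical_size  # logical sectors per physical sector
    # Both the start and the length must fall on physical sector boundaries.
    return count > 0 and lba % ratio == 0 and count % ratio == 0
```

A write for which this returns False only partly covers at least one physical sector, which is the case where the outcome of a failed write is less clear.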
I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. With a really large IO, a naive retry of the whole request could leave the system unresponsive for a very long time...
ric
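One way to picture the trade-off between trying the other parts of a failed I/O and staying responsive is a retry that works sector by sector under a time budget. This is only a sketch in Python: `read_fn`, `budget_s` and `salvage_read` are hypothetical names invented here, not any kernel or driver interface.

```python
import time


def salvage_read(read_fn, start, count, budget_s, clock=time.monotonic):
    """Try one large read; if it fails, retry sector by sector within a time budget.

    read_fn(start, count) is a hypothetical callable that returns bytes or
    raises OSError. Returns (good, bad, skipped):
      good    maps the starting sector of each successful read to its data
      bad     lists sectors whose single-sector read failed
      skipped lists sectors never attempted because the budget ran out
    """
    deadline = clock() + budget_s
    try:
        # Common case: the whole request succeeds in one go.
        return {start: read_fn(start, count)}, [], []
    except OSError:
        pass

    good, bad, skipped = {}, [], []
    for sector in range(start, start + count):
        if clock() >= deadline:
            # Give up on the rest so the caller gets an answer in bounded time.
            skipped.extend(range(sector, start + count))
            break
        try:
            good[sector] = read_fn(sector, 1)
        except OSError:
            bad.append(sector)
    return good, bad, skipped
```

The budget is checked only between reads, so a single slow read can still overrun it; a real implementation would also need a per-command timeout.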