Re: end to end error recovery musings
From: Jeff Garzik <hidden>
Date: 2007-02-26 22:46:59
Also in:
linux-fsdevel, linux-ide, linux-scsi
Theodore Tso wrote:
Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically rewrite the sector to a sector in its spare pool, and automatically redirect that sector to the new location? I believe this should always be true, so presumably with all modern disk drives a write error should mean something very serious has happened.
This is what will /probably/ happen. The drive should indeed find a spare sector and remap it if the write attempt encounters a bad spot on the media. However, with a large enough write, a large enough bad spot on the media, and firmware programmed never to take more than X seconds to complete an enterprise customer's I/O, it might just fail.

IMO, somewhere in the kernel, when we receive a read-op or write-op media error, we should immediately try to plaster that area with small writes. Sure, if it's a read-op you have lost data, but this method will maximize the chance that you can refresh/reuse the logical sectors in question.

Jeff
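
To make the "plaster that area with small writes" idea concrete, here is a rough userspace sketch in Python. It is not kernel code and not anything that exists in the tree; the function name, the 512-byte sector size, and the zero-fill default are all assumptions made for illustration. The point is only that each write is small, so the drive has a better chance of remapping one bad sector within its time budget.

```python
import os

SECTOR_SIZE = 512  # assumed logical sector size; real devices may use 4096


def rewrite_in_small_chunks(fd, start, length, data=None, chunk=SECTOR_SIZE):
    """Rewrite the byte range [start, start + length) one small chunk at a time.

    Hypothetical helper: `data` is whatever replacement content the caller
    still has (zeros if None, e.g. after a failed read where the data is lost).
    Returns a list of (offset, size) tuples for chunks that still failed.
    """
    if data is None:
        data = bytes(length)
    failed = []
    for pos in range(0, length, chunk):
        piece = data[pos:pos + chunk]
        try:
            # One small write per chunk, so a failure is isolated to that chunk.
            written = os.pwrite(fd, piece, start + pos)
            if written != len(piece):
                failed.append((start + pos, len(piece)))
        except OSError:
            failed.append((start + pos, len(piece)))
    return failed
```

On a real block device the descriptor would need to be opened so that errors surface per write (for example with O_SYNC or O_DIRECT, the latter with aligned buffers); with ordinary buffered I/O the writes only reach the page cache and a media error would show up later, if at all.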