Re: ata_eh_link_autopsy: Bug?
From: Mark Lord <hidden>
Date: 2012-05-01 23:52:34
On 12-05-01 07:48 PM, Mark Lord wrote:
> On 12-05-01 05:58 PM, Tejun Heo wrote:
>> On Tue, May 01, 2012 at 04:27:00PM -0400, Mark Lord wrote:
>>> MMmm.. even that isn't good enough, because the first ATA_QCFLAG_IO test bypasses the rest of that logic and triggers unconditional retries. Ugh.
>>
>> Hmmm... the unconditional retry on ATA_QCFLAG_IO is intentional so that known good requests from FS are guaranteed to be retried no matter how whacky the underlying device is. I'm not sure whether that was a good decision tho. Maybe we should trust the hardware a bit more. So, I'm not necessarily against changing it.
>
> With multi-terabyte drives being commonplace now, bad sectors seem to be a more frequent occurrence than I can remember from the past. And when libata stumbles across a bad sector, it literally hangs the machine for _minutes_ doing retries. I have never seen a retry make any difference whatsoever on a bad sector read. New, old, or ancient hardware.

And as a reminder to anyone else listening in, it's easier than you might think to test failure paths like this. Here, I keep a few 300GB drives around just for that purpose, and use "hdparm --make-bad-sector" on them to inject media errors at specific places on the disk or filesystem.

Cheers
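
For anyone wanting to try the test approach described above, here is a minimal sketch of how such an injection run might look. The device path /dev/sdX, the sector number 123456, and the RUN dry-run switch are placeholders assumed for illustration and are not from the thread. Only the hdparm options themselves (--make-bad-sector, --read-sector, --repair-sector, --yes-i-know-what-i-am-doing) are real. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: DEV, SECTOR and RUN are illustrative placeholders.
DEV="${DEV:-/dev/sdX}"       # scratch drive; anything on it is at risk
SECTOR="${SECTOR:-123456}"   # LBA at which to inject the media error
RUN="${RUN-echo}"            # default "echo" prints commands; set RUN= to execute them

inject_bad_sector() {
    # Corrupt the sector so that later reads report a media error.
    # hdparm refuses to do this without the explicit confirmation flag.
    $RUN hdparm --yes-i-know-what-i-am-doing --make-bad-sector "$SECTOR" "$DEV"
}

read_bad_sector() {
    # Read the sector back; on a real drive this is expected to fail
    # and exercise the libata error-handling path.
    $RUN hdparm --read-sector "$SECTOR" "$DEV"
}

repair_bad_sector() {
    # Rewrite the sector with zeros to clear the injected error afterwards.
    $RUN hdparm --yes-i-know-what-i-am-doing --repair-sector "$SECTOR" "$DEV"
}

inject_bad_sector
read_bad_sector
repair_bad_sector
```

Running it as-is prints the three hdparm command lines. Running it as root with `RUN= DEV=/dev/sdb SECTOR=...` would actually execute them, which should only ever be done on a drive kept purely for testing.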