Re: LibPATA code issues / 2.6.15.4

From: Mark Lord <hidden>
Date: 2006-02-26 14:04:24
Also in: lkml

David Greaves wrote:
> Mark Lord wrote:
>>> sdb: Current: sense key: Medium Error
>>>     Additional sense: Unrecovered read error - auto reallocate failed
>>> end_request: I/O error, dev sdb, sector 398283329
>>> raid1: Disk failure on sdb2, disabling device.
>>>         Operation continuing on 1 devices
>> ..
>> The command failing above is SCSI WRITE_10, which is being
>> translated into ATA_CMD_WRITE_FUA_EXT by libata.
>>
>> This command fails -- unrecognized by the drive in question.
>> But libata reports it (most incorrectly) as a "medium error",
>> and the drive is taken out of service from its RAID.
>>
>> Bad, bad, and worse.
>> ..
> Thanks Mark
>
> I'm glad it's a bug and not bad hardware.
>
> I am quite concerned that the basic effect of just booting a practically
> vanilla 2.6.16-rc4 like this was to fry my raid array.
>
> Luckily it dropped 2 (of 3) disks so quickly that the event counter was
> the same allowing an easy rebuild.
>
> 2.6.15 has similar issues but they seem to happen *very* infrequently by
> comparison - this hit me several times during a single boot.
>
> Should Linus (cc'ed) hold off on 2.6.16 because of this or not?

Well, no doubt whatsoever about it being a "regression",
since the FUA code is *new* in 2.6.16 (not present in 2.6.15).

The FUA code should either get fixed, or removed from 2.6.16.
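
For illustration only, here is a rough sketch of what "fixed" could look like. It is not the actual libata code or a proposed patch. It assumes the translation layer can see the drive's IDENTIFY DEVICE data and only issues the FUA opcode when word 84 bit 6 advertises it. The helper names drive_has_fua() and pick_write_opcode() are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ATA opcodes for 48-bit DMA writes. */
#define CMD_WRITE_DMA_EXT      0x35
#define CMD_WRITE_DMA_FUA_EXT  0x3D

/* IDENTIFY DEVICE word 84: command set/feature supported extension. */
#define ID_WORD_CFSSE 84

/* Hypothetical helper: true only if the drive advertises WRITE DMA FUA EXT. */
static bool drive_has_fua(const uint16_t *id)
{
    assert(id != NULL);

    /* Word 84 is valid only when bit 15 is clear and bit 14 is set. */
    if ((id[ID_WORD_CFSSE] & 0xC000) != 0x4000)
        return false;

    /* Bit 6 reports support for the FUA write commands. */
    return (id[ID_WORD_CFSSE] & (1u << 6)) != 0;
}

/*
 * Hypothetical helper: choose the opcode for a 48-bit DMA write.
 * If FUA was requested but the drive lacks it, fall back to a plain
 * write. A real implementation would also have to decide how to keep
 * the data durable in that case (for example with a cache flush).
 */
static uint8_t pick_write_opcode(const uint16_t *id, bool fua_requested)
{
    if (fua_requested && drive_has_fua(id))
        return CMD_WRITE_DMA_FUA_EXT;
    return CMD_WRITE_DMA_EXT;
}
```

With a check along these lines, a drive that does not recognize the FUA opcode would never be sent it, so there would be no bogus "medium error" to kick it out of the array.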

Cheers