Thread: 24 messages, 3 authors, 2015-08-03

Re: [PATCH 1/6] libata: Do not retry commands with valid autosense

From: Hannes Reinecke <hare@suse.de>
Date: 2015-08-03 18:21:23
Also in: linux-scsi

On 08/03/2015 07:01 PM, Tejun Heo wrote:
> Hello, Hannes.
>
> On Mon, Aug 03, 2015 at 06:47:46PM +0200, Hannes Reinecke wrote:
>> At the moment NCQ autosense is mostly used to provide the host with more
>> details for a failed I/O. The typical case here is (no small surprise)
>> ZAC disks, which use autosense to inform the host about
>> a malformed I/O.
>>
>> It is _not_ being used as a replacement for existing error behaviour,
>> (ie link errors are not being signalled with that; how could they
>> if there is no link?); in fact, during testing I've seen both, autosense
>
> Hmmm?  Devices can report link error via TF bits and you're bypassing
> TF analysis completely if sense data is present.

But sense data will never be present for a link error ... and I'm not
disabling that.

>> I/O failures and normal I/O failures for which autosense is
>> not set, and the normal error handling kicks in.
>>
>> It's not that I've disabled the original error handler completely,
>> it's only bypassed for I/O failure where a sense code is provided.
>> And the drive surely knows which error occurs, so we'd be daft not to be
>> using that.
>
> The patches are altering EH actions in a very subtle way depending on
> *how* an error is reported, not *what* is reported, which is a pretty
> silly thing to do.  It makes things a lot more confusing to follow and
> predict.  I really don't think this is an acceptable behavior.

But if we were judging on _what_ is being reported we would have to know
which sense code the drive will be reporting, in effect carrying
a massive table of possible sense codes, and mapping these to the actions.
Do you really want to go that way?
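
To make the trade-off concrete, here is a minimal sketch of the kind of
table in question: a lookup from SCSI sense key to an error-handling
action. The sense key values (0x02 NOT READY, 0x03 MEDIUM ERROR,
0x05 ILLEGAL REQUEST, 0x0b ABORTED COMMAND) are the standard ones.
Everything else, including the demo_ names and the action chosen for each
key, is an assumption made for illustration and is not taken from libata
or from the patch set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical EH actions; these names are illustrative, not libata's. */
enum demo_eh_action {
	DEMO_EH_RETRY,	/* reissue the command */
	DEMO_EH_FAIL,	/* complete with an error, no retry */
};

struct demo_sense_rule {
	uint8_t sense_key;		/* standard SCSI sense key */
	enum demo_eh_action action;	/* assumed policy for that key */
};

/* The keys are standard values; the mapped actions are assumptions. */
static const struct demo_sense_rule demo_sense_rules[] = {
	{ 0x02, DEMO_EH_RETRY },	/* NOT READY */
	{ 0x03, DEMO_EH_FAIL },		/* MEDIUM ERROR */
	{ 0x05, DEMO_EH_FAIL },		/* ILLEGAL REQUEST, e.g. a malformed I/O */
	{ 0x0b, DEMO_EH_RETRY },	/* ABORTED COMMAND */
};

/* Look up the assumed action for a sense key. */
static enum demo_eh_action demo_action_for_sense(uint8_t sense_key)
{
	size_t n = sizeof(demo_sense_rules) / sizeof(demo_sense_rules[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (demo_sense_rules[i].sense_key == sense_key)
			return demo_sense_rules[i].action;
	}
	/* Unknown key: fall back to retrying, as if no sense data existed. */
	return DEMO_EH_RETRY;
}
```

A table that is actually useful would likely have to key on the additional
sense code and qualifier as well, not just the sense key, which is where
the size concern comes from.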

>> So I think disabling autosense completely is a bit extreme...
>
> Please restructure the feature so that it doesn't interfere with the
> usual EH behavior.  e.g. leave the EH actions alone unless explicitly
> necessary but report detailed error information upwards.  If the extra
> error information can be helpful in determining what EH actions to
> take, factoring in that information can be helpful too but I'm not too
> convinced that'd make a huge difference.

It does if we can avoid the retry on the libata layer.

> Also, please consider that ATA_QCFLAG_SENSE_VALID handling assumes
> that the reporting device is an ATAPI device and the command in
> question is not a regular IO one.  That's why EH ignores AC_ERR_DEV or
> AC_ERR_OTHER if ATA_QCFLAG_SENSE_VALID is set.  This doesn't work out
> for the new ATA usage at all.

I've just mapped onto ATA_QCFLAG_SENSE_VALID as I've found this to be
reasonably close to what I've been needing.
If that's the wrong choice of course I can modify this.
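
The behaviour under discussion, where EH drops AC_ERR_DEV and AC_ERR_OTHER
once ATA_QCFLAG_SENSE_VALID is set, can be modelled in a few lines. This
is a standalone sketch, not kernel code: the flag and error names follow
the thread and libata, but the numeric values and the
demo_filter_err_mask() helper are assumptions for illustration.

```c
/* Illustrative stand-ins: the names mirror libata, the values are assumed. */
#define AC_ERR_DEV		(1u << 0)	/* device reported an error */
#define AC_ERR_ATA_BUS		(1u << 4)	/* link or bus error */
#define AC_ERR_OTHER		(1u << 8)	/* unclassified error */
#define ATA_QCFLAG_SENSE_VALID	(1u << 5)	/* sense data is present */

/*
 * Hypothetical helper: return the error mask EH would act on.
 * When sense data is valid, device and unclassified errors are dropped;
 * all other error bits are passed through unchanged.
 */
static unsigned int demo_filter_err_mask(unsigned int qc_flags,
					 unsigned int err_mask)
{
	if (qc_flags & ATA_QCFLAG_SENSE_VALID)
		err_mask &= ~(AC_ERR_DEV | AC_ERR_OTHER);
	return err_mask;
}
```

In this model, demo_filter_err_mask(ATA_QCFLAG_SENSE_VALID,
AC_ERR_DEV | AC_ERR_ATA_BUS) leaves only AC_ERR_ATA_BUS set. Dropping the
device error may be reasonable for an ATAPI command that carries its own
sense data, but for a regular I/O command it removes the bit that EH would
otherwise use.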

> For now, I've reverted the changes as this is actively detrimental.

Oh. So you've tested this on a device with autosense enabled?
I haven't seen any negative effects with this patchset, autosense
enabled or not.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)