Re: [PATCH 1/6] libata: Do not retry commands with valid autosense
From: Hannes Reinecke <hare@suse.de>
Date: 2015-08-03 18:21:23
Also in:
linux-scsi
On 08/03/2015 07:01 PM, Tejun Heo wrote:
Hello, Hannes. On Mon, Aug 03, 2015 at 06:47:46PM +0200, Hannes Reinecke wrote:quoted
At the moment NCQ autosense is mostly used to provide the host with more details for a failed I/O. The typical case here is (no small surprise) ZAC disks, which use autosense to inform the host about a malformed I/O. It is _not_ being used as a replacement for existing error behaviour, (ie link errors are not being signalled with that; how could they if there is no link?); in fact, during testing I"ve seen both, autosenseHmmm? Devices can report link error via TF bits and you're bypassing TF analysis completley if sense data is present.
But sense data will never be present for a link error ... and I'm not disabling that.
quoted
I/O failures and normal I/O failures for which autosense is not set, and the normal error handling kicks in. It's not that I've disable the original error handler completely, it's only bypassed for I/O failure where a sense code is provided. And the drive surely knows which error occurs, so we'd be daft not be using that.The patches are altering EH actions in a very subtle way depending on *how* an error is reported, not *what* is reported, which is a pretty silly thing to do. It makes things a lot more confusing to follow and predict. I really don't think this is an acceptable behavior.
But if we were judging on _what_ is being reported we would have to know which sense code the drive will be reporting, in effect carrying a massive table of possible sense codes, and map this to the actions. Do you really want to go that way?
quoted
So I think disabling autosense completely is a bit extreme...Please restructure the feature so that it doesn't interfere with the usual EH behavior. e.g. leave the EH actions alone unless explicitly necessary but report detailed error information upwards. If the extra error information can be helpful in determining what EH actions to take, factoring in that information can be helpful too but I'm not too convinced that'd make a huge difference.
It does if we can avoid the retry on the libata layer.
Also, please consider that ATA_QCFLAG_SENSE_VALID handling assumes that the reporting device is an ATAPI device and the command in question is not a regular IO one. That's why EH ignores AC_ERR_DEV or AC_ERR_OTHER if ATA_QCFLAG_SENSE_VALID is set. This doesn't work out for the new ATA usage at all.
I've just mapped onto ATA_QCFLAGS_SENSE_VALID as I've found this to reasonably close to what I've been needing. If that's the wrong choice of course I can modify this.
For now, I've reverted the changes as this is actively detrimental.
Oh. So you've tested this on a device with autosense enabled? I haven't seen any negative effects with this patchset, autosense enabled or not. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html