Re: libata error handling
From: Mike Anderson <hidden>
Date: 2005-08-19 20:30:33
Also in: linux-scsi, lkml
Luben Tuikov [off-list ref] wrote:
On 08/19/05 15:38, Patrick Mansfield wrote:
The eh_timed_out + eh_strategy_handler is actually pretty perfect, and _complete_, for any application and purpose in recovering a LU/device/host (in that order ;-) ).
The two problems I see with the hook are: It calls the driver in interrupt context, so the called function can't sleep.
Consider this: When SCSI Core told you that the command timed out, A) it has already finished, B) it hasn't already finished.
In case A, you can return EH_HANDLED. In case B, you return EH_NOT_HANDLED, and deal with it in the eh_strategy_handler. (Hint: you can still "finish" it from there.)
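The A/B split described above can be sketched roughly as follows. This is a userspace mock under stated assumptions, not kernel code: the enum only reuses the two return-code names from the thread (its numeric values are arbitrary), and `struct mock_cmd`, `mock_eh_timed_out()` and `mock_eh_strategy_handler()` are hypothetical names made up for illustration.

```c
#include <stdbool.h>

/* Userspace stand-ins for the return codes named in the thread.
 * The numeric values are arbitrary, not taken from the kernel. */
enum eh_timer_return {
    EH_NOT_HANDLED,
    EH_HANDLED
};

/* Hypothetical per-command driver state, for illustration only. */
struct mock_cmd {
    bool hw_completed;   /* case A: the command finished before the timer fired */
    bool deferred_to_eh; /* case B: left for the strategy handler */
    bool finished;       /* command has been completed */
};

/* Timeout hook sketch. In the real hook the driver cannot sleep,
 * so this version only inspects and updates state. */
static enum eh_timer_return mock_eh_timed_out(struct mock_cmd *cmd)
{
    if (cmd->hw_completed) {
        /* Case A: already finished, nothing left to recover. */
        cmd->finished = true;
        return EH_HANDLED;
    }
    /* Case B: not finished, defer to the strategy handler. */
    cmd->deferred_to_eh = true;
    return EH_NOT_HANDLED;
}

/* Strategy handler sketch. This is where sleeping recovery work
 * (abort, reset) would go; here it just "finishes" deferred commands
 * and returns how many it recovered. */
static int mock_eh_strategy_handler(struct mock_cmd *cmds, int n)
{
    int recovered = 0;

    for (int i = 0; i < n; i++) {
        if (cmds[i].deferred_to_eh && !cmds[i].finished) {
            cmds[i].finished = true;
            cmds[i].deferred_to_eh = false;
            recovered++;
        }
    }
    return recovered;
}
```

The mock only models the decision flow; it says nothing about which other commands on the host are held up while the strategy handler runs.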
But dealing with it in the eh_strategy_handler means that you may be stopping all IO on the host instance once the first LUN returns EH_NOT_HANDLED for LUN-based canceling. I still think we can do better here for an LLDD that cannot execute a cancel in interrupt context.
Having an error handler that works is a plus. I would hope that, over time, some factoring would happen from the eh_strategy_handler to some transport (or other factor point) error handler. From a testing, support, and block-level multipath predictability standpoint, I would think sharing code would be a good goal.
-andmike
--
Michael Anderson
andmike@us.ibm.com