Re: [PATCH 49/51] scsi: Move eh_device_reset_handler() to use scsi_device as argument
From: Hannes Reinecke <hare@suse.de>
Date: 2021-08-17 18:09:41
On 8/17/21 6:13 PM, Steffen Maier wrote:
On 8/17/21 11:14 AM, Hannes Reinecke wrote:quoted
When resetting a device we shouldn't depend on an existing SCSI command, as this might be completed at any time. Rather we should use 'struct scsi_device' as argument for eh_device_reset_handler(). Signed-off-by: Hannes Reinecke <hare@suse.com>Acked-by: Steffen Maier <redacted> # for zfcp However, independent review comments for common code below...quoted
--- Documentation/scsi/scsi_eh.rst | 2 +- Documentation/scsi/scsi_mid_low_api.rst | 4 +--quoted
drivers/s390/scsi/zfcp_scsi.c | 4 +--quoted
drivers/scsi/scsi_error.c | 35 +++++++++++++------quoted
include/scsi/scsi_host.h | 2 +- 62 files changed, 314 insertions(+), 329 deletions(-)diff --git a/Documentation/scsi/scsi_eh.rstb/Documentation/scsi/scsi_eh.rst index e09c81a54702..23f0d09668d9 100644--- a/Documentation/scsi/scsi_eh.rst +++ b/Documentation/scsi/scsi_eh.rst@@ -214,7 +214,7 @@ considered to fail always.:: int (* eh_abort_handler)(struct scsi_cmnd *); - int (* eh_device_reset_handler)(struct scsi_cmnd *); + int (* eh_device_reset_handler)(struct scsi_device *); int (* eh_target_reset_handler)(struct scsi_target *); int (* eh_bus_reset_handler)(struct Scsi_Host *, int); int (* eh_host_reset_handler)(struct Scsi_Host *);diff --git a/Documentation/scsi/scsi_mid_low_api.rstb/Documentation/scsi/scsi_mid_low_api.rst index 0afc1b4f89af..4650c0c6a22a 100644--- a/Documentation/scsi/scsi_mid_low_api.rst +++ b/Documentation/scsi/scsi_mid_low_api.rst@@ -778,7 +778,7 @@ Details::/** * eh_device_reset_handler - issue SCSI device reset - * @scp: identifies SCSI device to be reset + * @sdev: identifies SCSI device to be reset * * Returns SUCCESS if command aborted else FAILED *@@ -791,7 +791,7 @@ Details::* * Optionally defined in: LLD **/ - int eh_device_reset_handler(struct scsi_cmnd * scp) + int eh_device_reset_handler(struct scsi_device * sdev) /**quoted
diff --git a/drivers/s390/scsi/zfcp_scsi.cb/drivers/s390/scsi/zfcp_scsi.c index 6492c3b1b12f..4fa626763bb6 100644--- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c@@ -333,10 +333,8 @@ static int zfcp_scsi_task_mgmt_function(structscsi_device *sdev, u8 tm_flags) return retval; } -static int zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt) +static int zfcp_scsi_eh_device_reset_handler(struct scsi_device *sdev) { - struct scsi_device *sdev = scpnt->device; - return zfcp_scsi_task_mgmt_function(sdev, FCP_TMF_LUN_RESET); }quoted
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 1d8e2f655833..44e29558b068 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c@@ -910,7 +910,7 @@ static enum scsi_dispositionscsi_try_bus_device_reset(struct scsi_cmnd *scmd)quoted
struct scsi_host_template *hostt = scmd->device->host->hostt;quoted
if (!hostt->eh_device_reset_handler) return FAILED; - rtn = hostt->eh_device_reset_handler(scmd); + rtn = hostt->eh_device_reset_handler(scmd->device); if (rtn == SUCCESS) __scsi_report_device_reset(scmd->device, NULL); return rtn;ok (now that we use scmd->device 3 times in this function we could introduce a local variable sdev, similarly as starget in patch 36; but that would be more changed lines)quoted
@@ -1195,6 +1195,7 @@ scsi_eh_action(struct scsi_cmnd *scmd, enumscsi_disposition rtn) * scsi_eh_finish_cmd - Handle a cmd that eh is finished with.That old comment seems now in front of the new internal __scsi_eh_finish_cmd rathern than scsi_eh_finish_cmd ?quoted
* @scmd: Original SCSI cmd that eh has finished. * @done_q: Queue for processed commands. + * @result: Final command status to be setYou introduce a 3rd argument named "host_byte" (not "result") below?quoted
* * Notes: * We don't want to use the normal command completion while we are arequoted
* still handling errors - it may cause other commands to be queued, * and that would disturb what we are doing. Thus we really want toquoted
@@ -1203,10 +1204,18 @@ scsi_eh_action(struct scsi_cmnd *scmd, enumscsi_disposition rtn) * keep a list of pending commands for final completion, and once we * are ready to leave error handling we handle completion for real. */ -void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q) +void __scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q,Should that new internal helper be static?quoted
+ int host_byte) { + if (host_byte) + set_host_byte(scmd, host_byte); list_move_tail(&scmd->eh_entry, done_q); } +I whish we still had a kdoc for the actual API function scsi_eh_finish_cmd().quoted
+void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q) +{ + __scsi_eh_finish_cmd(scmd, done_q, 0); +} EXPORT_SYMBOL(scsi_eh_finish_cmd); /**@@ -1381,7 +1390,8 @@ static int scsi_eh_test_devices(struct list_head*cmd_list, if (finish_cmds && (try_stu || scsi_eh_action(scmd, SUCCESS) == SUCCESS)) - scsi_eh_finish_cmd(scmd, done_q); + __scsi_eh_finish_cmd(scmd, done_q, + DID_RESET); else list_move_tail(&scmd->eh_entry, work_q); }@@ -1529,8 +1539,9 @@ static int scsi_eh_bus_device_reset(structScsi_Host *shost, work_q, eh_entry) { if (scmd->device == sdev && scsi_eh_action(scmd, rtn) != FAILED) - scsi_eh_finish_cmd(scmd, - done_q); + __scsi_eh_finish_cmd(scmd, + done_q, + DID_RESET); } } } else {@@ -1598,7 +1609,8 @@ static int scsi_eh_target_reset(struct Scsi_Host*shost, if (rtn == SUCCESS) list_move_tail(&scmd->eh_entry, &check_list); else if (rtn == FAST_IO_FAIL) - scsi_eh_finish_cmd(scmd, done_q); + __scsi_eh_finish_cmd(scmd, done_q, + DID_TRANSPORT_DISRUPTED); else /* push back on work queue for further processing */ list_move(&scmd->eh_entry, work_q);@@ -1663,8 +1675,9 @@ static int scsi_eh_bus_reset(struct Scsi_Host*shost, list_for_each_entry_safe(scmd, next, work_q, eh_entry) { if (channel == scmd_channel(scmd)) { if (rtn == FAST_IO_FAIL) - scsi_eh_finish_cmd(scmd, - done_q); + __scsi_eh_finish_cmd(scmd, + done_q, + DID_TRANSPORT_DISRUPTED); else list_move_tail(&scmd->eh_entry, &check_list);@@ -1707,9 +1720,9 @@ static int scsi_eh_host_reset(struct Scsi_Host*shost, if (rtn == SUCCESS) { list_splice_init(work_q, &check_list); } else if (rtn == FAST_IO_FAIL) { - list_for_each_entry_safe(scmd, next, work_q, eh_entry) { - scsi_eh_finish_cmd(scmd, done_q); - } + list_for_each_entry_safe(scmd, next, work_q, eh_entry) + __scsi_eh_finish_cmd(scmd, done_q, + DID_TRANSPORT_DISRUPTED); } else { SCSI_LOG_ERROR_RECOVERY(3, shost_printk(KERN_INFO, shost,I don't understand the RESET vs. DISRUPTED depending on escalation level. Care to explain in the patch description (or even code comment)? Is there any functional change compared to today and if so which?
Well. That's arguably slightly odd, but anyway: _in principle_ an EH handler might return FAST_IO_FAIL, which would indicate that the EH function could not complete due the transport being busy (eg RSCN processing taking longer than expected). In that case it's expected to be a transient condition, which might/should be resolved with a simple retry. Hence we should indicate that by completing commands with DID_TRANSPORT_DISRUPTED, to allow upper layers to retry. Having said that it's slightly odd to have FAST_IO_FAIL being returned from a host reset, as really the intention is that the driver will be in a stable state after reset. But this doesn't seem to be true for zfcp, as we're calling fc_block_rport() after the reset itself, which means that there's a chance we'll be running into this situation. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer