Thread (83 messages) 83 messages, 7 authors, 2022-02-23

Re: [PATCH 08/51] zfcp: open-code fc_block_scsi_eh() for host reset

From: Hannes Reinecke <hare@suse.de>
Date: 2021-08-17 14:11:03

On 8/17/21 4:03 PM, Steffen Maier wrote:
On 8/17/21 2:54 PM, Hannes Reinecke wrote:
quoted
On 8/17/21 1:53 PM, Benjamin Block wrote:
quoted
On Tue, Aug 17, 2021 at 11:14:13AM +0200, Hannes Reinecke wrote:
quoted
@@ -383,9 +385,24 @@ static int
zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
      }
      zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
      zfcp_erp_wait(adapter);
-    fc_ret = fc_block_scsi_eh(scpnt);
-    if (fc_ret)
-        ret = fc_ret;
+retry_rport_blocked:
+    spin_lock_irqsave(host->host_lock, flags);
+    list_for_each_entry(port, &adapter->port_list, list) {
You need to take the `adapter->port_list_lock` to iterate over the
`port_list`.

i.e.: read_lock_irqsave(&adapter->port_list_lock, flags);
quoted
+        struct fc_rport *rport = port->rport;
+
+        if (rport->port_state == FC_PORTSTATE_BLOCKED) {
+            if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
+                ret = FAST_IO_FAIL;
+            else
+                ret = NEEDS_RETRY;
+            break;
+        }
+    }
+    spin_unlock_irqrestore(host->host_lock, flags);
+    if (ret == NEEDS_RETRY) {
+        msleep(1000);
+        goto retry_rport_blocked;
+    }
I really can't say I like this open coded FC code in the driver at all.

Is there a reason we can't use `fc_block_rport()` for all the rports of
the adapter?
Waiting for all rports to unblock in host_reset has been on my todo list
since we prepared the eh callbacks to get rid of scsi_cmnd with v4.18
commits:
674595d8519f ("scsi: zfcp: decouple our scsi_eh callbacks from scsi_cmnd")
42afc6527d43 ("scsi: zfcp: decouple TMFs from scsi_cmnd by using
fc_block_rport")
26f5fa9d47c1 ("scsi: zfcp: decouple SCSI setup of TMF from scsi_cmnd")
39abb11aca00 ("scsi: zfcp: decouple FSF request setup of TMF from
scsi_cmnd")
e0116c91c7d8 ("scsi: zfcp: split FCP_CMND IU setup between SCSI I/O and
TMF again")
266883f2f7d5 ("scsi: zfcp: decouple TMF response handler from scsi_cmnd")
822121186375 ("scsi: zfcp: decouple SCSI traces for scsi_eh / TMF from
scsi_cmnd")

But the synchronization is non-trivial as Benjamin's question shows.
There are also considerations about lock order, etc.

I'm busy with other things, so don't hold your breath until I can review
and test the code; I don't want any regression in that recovery code.
quoted
quoted
We already do use it for other EH callbacks in the same file, and you
already look up the rports in the adapters rport-list; so using that on
the rports in the loop, instead of open-coding it doesn't seem bad? Or
is there a locking problem?

We might waste a few cycles with that, but frankly, this is all in EH
and after adapter reset.. all performance concerns went our of the
window with that already.
Question would be why we need to call fc_block_rport() at all in host
reset.
To my understanding a host reset is expected to do a full resync of the
SAN topology, so the expectation is that after zfcp_erp_wait() the port
list is stable (ie the HBA has finished processing all RSCNs related to
the SAN resync).
There is more to do in zfcp than in other FC HBA drivers, e.g. LUN open
recoveries and how they related to rport unblock:
v4.10 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN
recovery").
The rport unblock is async to our internal recovery. zfcp_erp_wait()
only waits for the latter by design.
quoted
So can't we just drop the fc_block_rport() call here?
I don't think so.
quoted
All the other FC drivers do fine without that ...
It would have been nice to have a common interface for all scsi_eh
scopes. I.e. fc_block_host(struct Scsi_Host*) like we already have for
fc_block_scsi_eh(struct scsi_cmnd*) and fc_block_rport(struct fc_rport*)
[the latter having been introduced at the time of above eh callback
preparations].
But if zfcp is the only one needing it for host_reset, having the code
only in zfcp seems fine to me.
Right. Just wanted to clarify that.
If we need to use fc_block_rport() in host reset so be it; just wanted
to clarify if this _really_ is the case (and not just some copy'n'paste
stuff).
I'll be reworking the patch to call fc_block_rport().

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help