Re: [PATCH v9 13/13] lpfc: vmid: Introducing vmid in io path.
From: Hannes Reinecke <hare@suse.de>
Date: 2021-04-12 05:27:18
Also in:
linux-nvme, linux-scsi
On 4/10/21 5:00 PM, James Smart wrote:
On 4/8/2021 1:46 AM, Hannes Reinecke wrote:quoted
On 4/7/21 1:06 AM, Muneendra wrote:quoted
From: Gaurav Srivastava <redacted>...quoted
quoted
+ /* while the read lock was released, in case the entry was */ + /* added by other context or is in process of being added */ + if (vmp && vmp->flag & LPFC_VMID_REGISTERED) { + lpfc_vmid_update_entry(vport, cmd, vmp, tag); + write_unlock(&vport->vmid_lock); + return 0; + } else if (vmp && vmp->flag & LPFC_VMID_REQ_REGISTER) { + write_unlock(&vport->vmid_lock); + return -EBUSY; + } + + /* else search and allocate a free slot in the hash table */ + if (vport->cur_vmid_cnt < vport->max_vmid) { + for (i = 0; i < vport->max_vmid; ++i) { + vmp = vport->vmid + i; + if (vmp->flag == LPFC_VMID_SLOT_FREE) { + vmp = vport->vmid + i;delete this last line and adjust parens - really odd that it replicates the assignment 2 lines earlier.quoted
quoted
+ break; + } + }I would prefer that if the table is expended and no slots free, that -ENOMEM is returned here. Rather than falling down below and qualifying by slot free, then by pending (set only if slot free). I can't believe there is a reason the idle timer has to be started if no slots free as all the other fail cases don't bother with it. This also helps indentation levels below.quoted
quoted
+ } else { + write_unlock(&vport->vmid_lock); + return -ENOMEM; + } + + if (vmp && (vmp->flag == LPFC_VMID_SLOT_FREE)) { + /* Add the vmid and register */ + lpfc_put_vmid_in_hashtable(vport, hash, vmp); + vmp->vmid_len = len; + memcpy(vmp->host_vmid, uuid, vmp->vmid_len); + vmp->io_rd_cnt = 0; + vmp->io_wr_cnt = 0; + vmp->flag = LPFC_VMID_SLOT_USED; + + vmp->delete_inactive = + vport->vmid_inactivity_timeout ? 1 : 0; + + /* if type priority tag, get next available vmid */ + if (lpfc_vmid_is_type_priority_tag(vport)) + lpfc_vmid_assign_cs_ctl(vport, vmp); + + /* allocate the per cpu variable for holding */ + /* the last access time stamp only if vmid is enabled */ + if (!vmp->last_io_time) + vmp->last_io_time = + __alloc_percpu(sizeof(u64), + __alignof__(struct + lpfc_vmid)); + + /* registration pending */ + pending = 1; + } else { + rc = -ENOMEM; + } + write_unlock(&vport->vmid_lock); + + /* complete transaction with switch */ + if (pending) { + if (lpfc_vmid_is_type_priority_tag(vport)) + rc = lpfc_vmid_uvem(vport, vmp, true); + else + rc = lpfc_vmid_cmd(vport, + SLI_CTAS_RAPP_IDENT, + vmp); + if (!rc) { + write_lock(&vport->vmid_lock); + vport->cur_vmid_cnt++; + vmp->flag |= LPFC_VMID_REQ_REGISTER; + write_unlock(&vport->vmid_lock); + } + } + + /* finally, enable the idle timer once */ + if (!(vport->phba->pport->vmid_flag & LPFC_VMID_TIMER_ENBLD)) { + mod_timer(&vport->phba->inactive_vmid_poll, + jiffies + + msecs_to_jiffies(1000 * LPFC_VMID_TIMER)); + vport->phba->pport->vmid_flag |= LPFC_VMID_TIMER_ENBLD; + } + } + return rc; +} + +/* + * lpfc_is_command_vm_io - get the uuid from blk cgroup + * @cmd:Pointer to scsi_cmnd data structure + * Returns uuid if present if not null + */ +static char *lpfc_is_command_vm_io(struct scsi_cmnd *cmd) +{ + char *uuid = NULL; + + if (cmd->request) { + if (cmd->request->bio) + uuid = blkcg_get_fc_appid(cmd->request->bio); + } + return uuid; +} + /** * lpfc_queuecommand - scsi_host_template queuecommand entry point * @shost: kernel scsi host pointer.@@ -5288,6 +5437,7 @@ lpfc_queuecommand(struct Scsi_Host *shost,struct scsi_cmnd *cmnd) int err, idx; #ifdef CONFIG_SCSI_LPFC_DEBUG_FS uint64_t start = 0L; + u8 *uuid = NULL; if (phba->ktime_on) start = ktime_get_ns();@@ -5415,6 +5565,25 @@ lpfc_queuecommand(struct Scsi_Host *shost,struct scsi_cmnd *cmnd) } + /* check the necessary and sufficient condition to support VMID */ + if (lpfc_is_vmid_enabled(phba) && + (ndlp->vmid_support || + phba->pport->vmid_priority_tagging == + LPFC_VMID_PRIO_TAG_ALL_TARGETS)) { + /* is the IO generated by a VM, get the associated virtual */ + /* entity id */ + uuid = lpfc_is_command_vm_io(cmnd); + + if (uuid) { + err = lpfc_vmid_get_appid(vport, uuid, cmnd, + (union lpfc_vmid_io_tag *) + &lpfc_cmd->cur_iocbq.vmid_tag); + if (!err) + lpfc_cmd->cur_iocbq.iocb_flag |= LPFC_IO_VMID; + } + } + + atomic_inc(&ndlp->cmd_pending); #ifdef CONFIG_SCSI_LPFC_DEBUG_FS if (unlikely(phba->hdwqstat_on & LPFC_CHECK_SCSI_IO)) this_cpu_inc(phba->sli4_hba.c_stat->xmt_io);And that's the bit which I don't particular like. Essentially we'll have to inject additional ELS commands _on each I/O_ to get a valid VMID. Where there are _so_ many things which might get wrong, causing an I/O stall.I don't follow you - yes ELS's are injected when there isn't an entry for the VM, but once there is, there isn't further ELS's. That is the cost. as we don't know what uuid's I/O will be sent to before hand, so it has to be siphoned off during the I/O flow. I/O's can be sent non-tagged while the ELS's are completing (and there aren't multiple sets of ELS's as long as it's the same uuid), which is fine. so I disagree with "_on each I/O_".
Yeah, that was an exaggeration. Not on each I/O; I should've said 'on each unregistered I/O'. This really is my unhappiness with the entire VMID specification, where you can time out VMIDs after a certain period, and hence never be sure if any particular I/O really _is_ registered.
quoted
I would have vastly preferred if we could _avoid_ having to do additional ELS commands for VMID registration in the I/O path (ie only allow for I/O with a valid VMID), and reject the I/O otherwise until VMID registration is complete. IE return 'BUSY' (or even a command retry?) when no valid VMID for this particular I/O is found, register the VMID (preferably in another thread), and restart the queue once the VMID is registered.Why does it bother you with the I/O path ? It's actually happening in parallel with no real relation between the two. I seriously disagree with reject if no vmid tag. Why? what do you gain ? This actually introduces more disruption than the parallel flow with the ELSs. Also, after link bounce, where all VMID's have to be done, it adds a stall window after link up right when things are trying to resume after rports rejoin. Why add the i/o rebouncing ? There no real benefit. Issuing a few untagged I/O doesn't hurt.
That indeed is a valid point. I retract my objection. Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer