Thread (38 messages) 38 messages, 8 authors, 2020-06-29

Re: [PATCHv3 3/5] nvme: implement I/O Command Sets Command Set support

From: Keith Busch <kbusch@kernel.org>
Date: 2020-06-24 21:49:21
Also in: linux-nvme

On Wed, Jun 24, 2020 at 12:03:41PM -0700, Sagi Grimberg wrote:
quoted
quoted
quoted
quoted
If the controller does not support the CNS value, it may return Invalid
Field with DNR set. That error currently gets propogated back to
nvme_init_ns_head(), which then abandons the namespace. Just as the code
coments say, we had been historically been clearing such errors because
we have other ways to identify the namespace, but now we're not clearing
that error.
I don't understand what you are saying Keith.

My comment was for this block:
-- 
      if (!status && nvme_multi_css(ctrl) && !csi_seen) {
          dev_warn(ctrl->device, "Command set not reported for nsid:%d\n",
               nsid);
          status = -EINVAL;
      }
-- 

I was saying that !status doesn't necessarily mean success, but it can
also mean that its an retry-able error status (due to transport or
controller). If we see a retry-able error we should still clear the
status so we don't abandon the namespace.

This for example would achieve that:
We're not talking about the same thing. I am only talking about what
introduced the DNR check, from commit 59c7c3caaaf87.

I know you added it because you want to abort comparing identifiers on a
rescan when retrieving the identifiers failed. That's fine, but I am
saying this fails namespace creation in the first place for some types
of devices that used to succeed.
OK, now I think I understand (thanks for clarifying that the discussion
is not on patch 3/5, but rather on 59c7c3caaaf87).

So the original proposal was to check NVME_SC_HOST_PATH_ERROR (and now
we have NVME_SC_HOST_ABORTED_CMD) but with the review Christoph gave
it seemed to make more sense that we generalize and check the DNR bit.
Okay, I didn't question this approach when it first went through, so
sorry about this digression, but I really don't get how this DNR check
helps at all.

The code currently returns an error if DNR is set. Based on the commit
messages, it sounds like you need that error to skip comparing
identifiers through nvme's scan_work calling revalidate_disk():
suppressing the error has revalidate_disk() fail with -ENODEV when
comparing identifiers fails.

Since we do return the error when DNR is set, we skip comparing
identifiers and return blk_status_to_errno(nvme_error_status(ret))
instead. How is that errno an improvement?

And then if DNR is not set, we suppress the error and proceed with
comparing identifiers??
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help