Re: [PATCH 0/2] nvme: sanitize KATO handling
From: Chao Leng <hidden>
Date: 2021-02-24 06:42:39
On 2021/2/23 20:07, Hannes Reinecke wrote:
> Hi all,
>
> one of our customers had been running into a deadlock trying to terminate outstanding KATO commands during reset. Looking closer at it, I found that we never actually _track_ whether a KATO command has been submitted, so we might happily be sending several KATO commands to the same controller simultaneously.
Can you explain how KATO commands can be sent simultaneously?
> Also, I found it slightly odd that we signal a different KATO value to the controller than the one we're using internally; I would have thought that both sides should agree on the same KATO value. And even that wouldn't be so bad, but we really should be using the KATO value we announced to the controller when setting the request timeout.
>
> With these patches I attempt to resolve the situation; the first patch ensures that only one KATO command to a given controller is outstanding. With that, the delay between sending KATO commands and the KATO timeout are decoupled, and we can follow the recommendation from the base spec to send the KATO commands at half the KATO timeout intervals.
>
> As usual, comments and reviews are welcome.
>
> Hannes Reinecke (2):
>   nvme: fixup kato deadlock
>   nvme: sanitize KATO setting
>
>  drivers/nvme/host/core.c    | 22 +++++++++++++++++-----
>  drivers/nvme/host/fabrics.c |  2 +-
>  drivers/nvme/host/nvme.h    |  2 +-
>  3 files changed, 19 insertions(+), 7 deletions(-)
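
For readers following the thread, the idea described in the cover letter can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patches: `struct ka_ctrl` and the `ka_*` helpers are hypothetical names, the state is a simplified stand-in for the real controller structure, and a real driver would need atomic or locked access to the in-flight flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified state; NOT the kernel's struct nvme_ctrl. */
struct ka_ctrl {
    uint32_t kato_sec;     /* KATO value announced to the controller, in seconds */
    bool     ka_in_flight; /* true while a keep-alive command is outstanding */
};

/* Delay between keep-alive submissions: half the announced KATO, in ms. */
static uint32_t ka_interval_ms(const struct ka_ctrl *c)
{
    return c->kato_sec * 1000u / 2u;
}

/* Request timeout for a keep-alive command: the full announced KATO, in ms. */
static uint32_t ka_timeout_ms(const struct ka_ctrl *c)
{
    return c->kato_sec * 1000u;
}

/*
 * Try to submit a keep-alive. Returns false if keep-alive is disabled
 * (KATO of 0) or if one is already outstanding, so at most one command
 * per controller is in flight at any time.
 */
static bool ka_try_submit(struct ka_ctrl *c)
{
    if (c->kato_sec == 0 || c->ka_in_flight)
        return false;
    c->ka_in_flight = true;
    return true;
}

/* Completion handler: the next keep-alive may now be submitted. */
static void ka_complete(struct ka_ctrl *c)
{
    c->ka_in_flight = false;
}
```

With a KATO of 10 seconds, this sketch would schedule a keep-alive every 5000 ms while giving each command the full 10000 ms before it times out, and a second submission attempt is refused until the first one completes.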