Re: [PATCH 0/2] nvme: sanitize KATO handling
From: Hannes Reinecke <hare@suse.de>
Date: 2021-02-24 07:59:27
On 2/24/21 8:20 AM, Chao Leng wrote:
> On 2021/2/24 15:06, Hannes Reinecke wrote:
>> On 2/24/21 7:42 AM, Chao Leng wrote:
>>> On 2021/2/23 20:07, Hannes Reinecke wrote:
>>>> Hi all,
>>>>
>>>> one of our customer had been running into a deadlock trying to
>>>> terminate outstanding KATO commands during reset.
>>>> Looking closer at it, I found that we never actually _track_ if a
>>>> KATO command is submitted, so we might happily be sending several
>>>> KATO commands to the same controller simultaneously.
>>> Can you explain how can send KATO commands simultaneously?
>>
>> Sure.
>> Call nvme_start_keep_alive() on a dead connection.
>> Just _after_ the KATO request has been sent,
>> call nvme_start_keep_alive() again.
> Call nvme_start_keep_alive() again? why?
> Now just nvme_start_ctrl call nvme_start_keep_alive().
> The ka_work will be canceled sync before start reconnection.
> Did I miss something?

My point was that there _can_ be a ka_work() entry even when a KATO
command is running.
And yes, the ka_work entry will be cancelled, but _before_ the
outstanding commands are cancelled.
And cancelling the ka_work entry might cause the function to be
executed, which leads to a deadlock if blk_mq_get_request() is blocked
(e.g. if the queue is already stopped due to recovery).

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
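
For illustration only, the idea of tracking whether a KATO command is
outstanding can be sketched in userspace C. This is not the patch and
not kernel code: struct fake_ctrl, fake_send_keep_alive() and
fake_keep_alive_done() are hypothetical names, and the in-flight flag
is an assumption about how such tracking could look.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a controller; not the kernel's struct nvme_ctrl. */
struct fake_ctrl {
	atomic_flag ka_in_flight;	/* set while a keep-alive is outstanding */
	int submitted;			/* keep-alive commands actually sent */
};

/*
 * Try to send a keep-alive command.
 * Returns false without sending if the previous one has not completed,
 * so at most one keep-alive is outstanding per controller.
 */
static bool fake_send_keep_alive(struct fake_ctrl *ctrl)
{
	if (atomic_flag_test_and_set(&ctrl->ka_in_flight))
		return false;
	ctrl->submitted++;
	return true;
}

/* Completion path: clear the flag so the next keep-alive may be sent. */
static void fake_keep_alive_done(struct fake_ctrl *ctrl)
{
	atomic_flag_clear(&ctrl->ka_in_flight);
}
```

With a controller initialised as
{ .ka_in_flight = ATOMIC_FLAG_INIT, .submitted = 0 }, a second call to
fake_send_keep_alive() before fake_keep_alive_done() is refused, so the
"start keep-alive twice" sequence above sends only one command.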