Re: [PATCH] nvme-fabrics: fix crash for no IO queues
From: Sagi Grimberg <sagi@grimberg.me>
Date: 2021-03-15 17:09:31
>>> A crash happens when Set Features (NVME_FEAT_NUM_QUEUES) times out during
>>> NVMe over RDMA (RoCE) reconnection. The cause is the use of a queue that
>>> has not been allocated. If a queue is not live, requests should not be
>>> queued on it.
>>
>> Can you describe exactly the scenario here? What is the state here?
>> LIVE? or DELETING?
>
> If setting the feature (NVME_FEAT_NUM_QUEUES) fails due to a timeout, or
> the target returns 0 I/O queues, nvme_set_queue_count() will return 0, and
> the reconnection will then continue and succeed. The state of the
> controller is LIVE. Requests will continue to be delivered by calling
> ->queue_rq(), and then the crash happens.

Thinking about this again, we should absolutely fail the reconnection
when we are unable to set any I/O queues, it is just wrong to keep this
controller alive...

This should be fixed for both rdma and tcp.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
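
To make the proposal concrete, here is a rough userspace sketch of the check
being suggested: treat "zero I/O queues" as a hard failure of the reconnect
instead of letting the controller go LIVE. This is not kernel code. The
struct, the model_* function names, and the choice of -ENOMEM as the error
code are all assumptions made for illustration. The only behaviour taken
from the thread is that the queue-count step returns 0 with a count of 0
on timeout or when the target grants no I/O queues.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a controller; NOT the kernel's struct nvme_ctrl. */
struct model_ctrl {
	unsigned int granted_io_queues;	/* what the simulated target grants */
	int set_features_timed_out;	/* simulate a Set Features timeout */
	unsigned int queue_count;	/* admin queue + I/O queues, 0 if unset */
};

/*
 * Models the behaviour described in the thread: a timeout, or a target
 * that grants zero I/O queues, both produce "return 0" with *count == 0.
 */
static int model_set_queue_count(const struct model_ctrl *ctrl,
				 unsigned int *count)
{
	if (ctrl->set_features_timed_out || ctrl->granted_io_queues == 0) {
		*count = 0;
		return 0;
	}
	if (*count > ctrl->granted_io_queues)
		*count = ctrl->granted_io_queues;
	return 0;
}

/*
 * Sketch of the proposed guard: refuse to continue the reconnect when no
 * I/O queues could be set, so no request is later dispatched to a queue
 * that was never allocated. The error code is an assumption.
 */
static int model_alloc_io_queues(struct model_ctrl *ctrl,
				 unsigned int requested)
{
	unsigned int nr_io_queues = requested;
	int ret;

	ret = model_set_queue_count(ctrl, &nr_io_queues);
	if (ret)
		return ret;

	if (nr_io_queues == 0) {
		fprintf(stderr, "unable to set any I/O queues\n");
		return -ENOMEM;
	}

	ctrl->queue_count = nr_io_queues + 1;	/* +1 for the admin queue */
	return 0;
}
```

With a guard like this, the caller on the reconnect path would see a
negative return and could fail the reconnect attempt, rather than moving
the controller to LIVE with no usable I/O queues.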