Thread: 11 messages, 4 authors, 2021-03-17

Re: [PATCH] nvme-fabrics: fix crash for no IO queues

From: Sagi Grimberg <sagi@grimberg.me>
Date: 2021-03-15 17:09:31

>>> A crash happens when Set Features (NVME_FEAT_NUM_QUEUES) times out
>>> during nvme over rdma (roce) reconnection. The reason is that a queue
>>> which was never allocated gets used.
>>>
>>> If the queue is not live, queueing requests should not be allowed.
>>
>> Can you describe exactly the scenario here? What is the state
>> here? LIVE? or DELETING?
>
> If setting the feature (NVME_FEAT_NUM_QUEUES) fails due to a timeout, or
> the target returns 0 I/O queues, nvme_set_queue_count will return 0,
> and the reconnection will then continue and succeed. The state of the
> controller is LIVE. Requests will continue to be delivered by calling
> ->queue_rq(), and then the crash happens.

Thinking about this again, we should absolutely fail the reconnection
when we are unable to set any I/O queues, it is just wrong to
keep this controller alive...

This should be fixed for both rdma and tcp.
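
To illustrate the idea, here is a minimal user-space sketch, not kernel
code and not the actual patch. Every name in it (fake_ctrl,
fake_set_queue_count, fake_alloc_io_queues, fake_reconnect) is a
hypothetical stand-in, and the -ENOMEM return value is an arbitrary
choice for the sketch. It only models the behaviour described above: the
queue-count step reports success with zero I/O queues, and the reconnect
path treats that as a failure instead of moving the controller to LIVE.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a fabrics controller; not struct nvme_ctrl. */
enum ctrl_state { CTRL_CONNECTING, CTRL_LIVE };

struct fake_ctrl {
	enum ctrl_state state;
	unsigned int queue_count;	/* admin queue + I/O queues */
	int target_io_queues;		/* granted by the simulated target;
					 * < 0 simulates a Set Features timeout */
};

/*
 * Models the behaviour described in the thread: on a timeout, or when the
 * target grants no I/O queues, return 0 (success) with *nr_io_queues == 0.
 */
static int fake_set_queue_count(struct fake_ctrl *ctrl, int *nr_io_queues)
{
	if (ctrl->target_io_queues < 0) {
		*nr_io_queues = 0;	/* simulated timeout */
		return 0;
	}
	if (ctrl->target_io_queues < *nr_io_queues)
		*nr_io_queues = ctrl->target_io_queues;
	return 0;
}

/* Fail instead of continuing when no I/O queues could be set. */
static int fake_alloc_io_queues(struct fake_ctrl *ctrl, int requested)
{
	int nr_io_queues = requested;
	int ret = fake_set_queue_count(ctrl, &nr_io_queues);

	if (ret)
		return ret;

	if (nr_io_queues == 0) {
		fprintf(stderr, "unable to set any I/O queues\n");
		return -ENOMEM;	/* error code is an assumption of this sketch */
	}

	ctrl->queue_count = (unsigned int)nr_io_queues + 1;
	return 0;
}

/* The controller only becomes LIVE if I/O queues were actually set up. */
int fake_reconnect(struct fake_ctrl *ctrl, int requested)
{
	int ret;

	ctrl->state = CTRL_CONNECTING;
	ret = fake_alloc_io_queues(ctrl, requested);
	if (ret)
		return ret;	/* state stays CONNECTING */

	ctrl->state = CTRL_LIVE;
	return 0;
}
```

With target_io_queues set to -1 or 0, fake_reconnect() returns an error
and the controller never reaches CTRL_LIVE, so no request can be queued
against a queue that was not allocated. In this sketch the check lives in
a helper that both transports could share, but where it belongs in the
real drivers is for the patch to decide.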

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme