Thread: 15 messages, 4 authors, 2021-03-03

Re: [PATCH] nvme-rdma: fix crash for no IO queues

From: Chao Leng <hidden>
Date: 2021-02-27 09:31:21


On 2021/2/27 17:12, Hannes Reinecke wrote:
On 2/24/21 6:59 AM, Chao Leng wrote:

On 2021/2/24 7:21, Keith Busch wrote:
On Tue, Feb 23, 2021 at 03:26:02PM +0800, Chao Leng wrote:
A crash happens when the Set Features (NVME_FEAT_NUM_QUEUES) command
times out during nvme over rdma (RoCE) reconnection. The reason is that
a queue which was never allocated gets used.

If the controller is not a discovery controller and has no I/O queues,
the connection should fail.
If you're getting a timeout, we need to quit initialization. Hannes
attempted making that status visible for fabrics here:

http://lists.infradead.org/pipermail/linux-nvme/2021-January/022353.html
I know the patch. It cannot solve this scenario: the target may be an
attacker, or the target's behavior may be incorrect.
If the target returns 0 I/O queues or returns some other error code, the
crash will still happen. We should not allow this to happen.
I'm fully with you that we shouldn't crash, but at the same time a value of '0' for the number of I/O queues is considered valid.
So we should fix the code to handle this scenario, and not disallow zero I/O queues.
'0' I/O queues does not make any sense for nvme over fabrics; this is
different from nvme over pci. If there is a bug in the target, we can
debug it on the target instead of using the admin queue on the host.
The target may be an attacker, or the target's behavior may be incorrect,
so we should avoid the crash. Another option: prohibit request delivery
if the I/O queues were not created.
I think failing the connection with '0' I/O queues is the better choice.
Cheers,

Hannes
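
The two options raised in this message (fail the connection when a
non-discovery controller ends up with zero I/O queues, or refuse to
deliver requests to I/O queues that were never created) can be sketched
roughly as below. This is a standalone userspace illustration, not the
patch under review and not kernel code: struct sketch_ctrl, both
function names, and the chosen error codes are assumptions made only
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a fabrics controller.
 * It is NOT the kernel's struct nvme_ctrl. */
struct sketch_ctrl {
	bool is_discovery;         /* discovery controllers need no I/O queues */
	unsigned int nr_io_queues; /* I/O queues actually granted by the target */
};

/*
 * Option 1: fail the connection.
 * Returns 0 if setup may continue, or a negative errno if the
 * connection should be abandoned. -ENOMEM is an assumed error code.
 */
static int sketch_check_io_queues(const struct sketch_ctrl *ctrl)
{
	assert(ctrl != NULL);

	if (ctrl->is_discovery)
		return 0;
	if (ctrl->nr_io_queues == 0)
		return -ENOMEM;
	return 0;
}

/*
 * Option 2: prohibit request delivery to queues that do not exist.
 * qid 0 is the admin queue; I/O queues are numbered 1..nr_io_queues.
 * Returns 0 if the I/O request may be issued, -EIO (assumed) otherwise.
 */
static int sketch_queue_io(const struct sketch_ctrl *ctrl, unsigned int qid)
{
	assert(ctrl != NULL);

	if (qid == 0 || qid > ctrl->nr_io_queues)
		return -EIO;
	return 0;
}
```

With option 1 the bad state is rejected once, at connect or reconnect
time. With option 2 the controller stays connected with only its admin
queue, and every I/O submission path has to carry the guard.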
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme