Re: nvme tcp receive errors
From: Sagi Grimberg <sagi@grimberg.me>
Date: 2021-05-10 18:18:58
Sagi, just wanted to give you an update on where we're at with this. All tests run with your earlier patch removing the inline dispatch from nvme_tcp_queue_request() are successful. At this point, I am leaning toward removing that optimization from mainline.
Thanks Keith. Did you run it with the extra information debug patch I sent you? What I'm concerned about is that, given that you have the only environment where this reproduces, once this is removed it will be very difficult to add it back in. Also, what about the read issue? That one is still unresolved from my PoV.
I added additional tracing to see what is going on, but we eventually hit a memory issue after some hours of runtime. I've never seen an issue like this before. It triggers in nvme_tcp_advance_req() when tracing the rq->tag and req->data_sent:

WARNING: CPU: 1 PID: 3428 at arch/x86/include/asm/kfence.h:44 kfence_protect_page+0x33/0xa0

I think the above is a distraction, but I can provide the full stack trace and the patch adding the tracepoint if you think it's helpful.
That is... odd..