Re: Creating new RDMA driver for habanalabs
From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2021-08-22 22:31:32
On Sun, Aug 22, 2021 at 12:40:26PM +0300, Oded Gabbay wrote:
> Hi Jason,
>
> I think that about a year ago we talked about the custom RDMA code of habanalabs. I tried to upstream it and you, rightfully, rejected that.
>
> Now that I have enough b/w to do this work, I want to start writing a proper RDMA driver for the habanalabs Gaudi device, which I will be able to upstream to the infiniband subsystem.
>
> I don't know if you remember but the Gaudi h/w is somewhat limited in its RDMA capabilities. We are not selling a stand-alone NIC :) We just use RDMA (or more precisely, ROCEv2) to connect between Gaudi devices.
>
> I'm sure I will have more specific questions down the line, but I had hoped you could point me to a basic/not-too-complex existing driver that I can use as a modern template. I'm also aware that I will need to write matching code in rdma-core.
>
> Also, I would like to add we will use the auxiliary bus feature to connect between this driver, the main (compute) driver and the Ethernet driver (which we are going to publish soon I hope).

It sounds fine. As Leon mentions, EFA is a good starting point for something simple but not spec-compliant.

If I recall properly you'll want to have some special singular PD for the HW and some specialty QPs?

Jason