Re: [RFC PATCH v2 02/11] netdev: implement netlink api to bind dma-buf to netdevice
From: Mina Almasry <hidden>
Date: 2023-08-19 18:00:05
On Sat, Aug 19, 2023 at 7:19 AM Willem de Bruijn [off-list ref] wrote:
On Fri, Aug 18, 2023 at 11:30 PM David Ahern [off-list ref] wrote:
On 8/18/23 8:06 PM, Jakub Kicinski wrote:
On Fri, 18 Aug 2023 19:34:32 -0600 David Ahern wrote:
On 8/18/23 3:52 PM, Mina Almasry wrote:
The sticking points are: 1. From David: this proposal doesn't give an application the ability to flush an rx queue, which means that we have to rely on a driver reset that affects all queues to refill the rx queue buffers.

Generically, the design needs to be able to flush (or invalidate) all references to the dma-buf once the process no longer "owns" it.

Are we talking about the ability for the app to flush the queue when it wants to (to do no idea what)? Or auto-flush when the app crashes?

If a buffer reference can be invalidated such that a posted buffer is ignored by H/W, then no flush is needed per se. Either way, the key point is that posted buffers can no longer be filled by H/W once a process no longer owns the dma-buf reference. I believe the actual mechanism here will vary by H/W.

Right. Many devices only allow bringing all queues down at the same time.
FWIW, I spoke with our Praveen (GVE maintainer) about this. The suspicion is that bringing individual queues up/down _should_ work with GVE for the most part, but that is pending me trying it and confirming. I think if a driver can't support bringing individual queues up/down, then none of Jakub's proposed per-queue configs (queue_mem_alloc, queue_mem_free, queue_start, queue_stop) can be implemented on that driver, and David's concern about the dma-buf being auto-detached if the application crashes/exits cannot be addressed either. Such a driver will not be able to support device memory TCP unless there is an option to make it work with a full driver reset.
Once a descriptor is posted and the ring head is written, there is no way to retract that. Since waiting for the device to catch up is not acceptable, the only option is to bring down the queue, right? Which will imply bringing down the entire device on many devices. Not ideal, but acceptable short term, imho.
I also wonder if it may be acceptable to have both modes supported. I.e. (roughly):

1. Add APIs that create an rx-queue bound to a dma-buf.
2. Add APIs that bind an rx-queue to a dma-buf.

Drivers that support per-queue allocation/freeing can support and use #1 and work the way David likes. Drivers that cannot allocate or bring up individual queues can only support #2, and must trigger a driver reset to refill or release the dma-buf references. This patch series already implements the #2 APIs.
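Roughly, the two modes could be dispatched as in the userspace sketch below. The four hook names come from the per-queue configs mentioned earlier in this thread; everything else (`struct queue_ops`, `struct toy_dev`, `release_dmabuf_refs()`, the `toy_*` stubs and counters) is hypothetical and exists only to make the example compile. It is not the API in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue hooks, named after the ops discussed in the thread. */
struct queue_ops {
    int (*queue_mem_alloc)(int qidx);
    int (*queue_mem_free)(int qidx);
    int (*queue_start)(int qidx);
    int (*queue_stop)(int qidx);
};

/* Hypothetical device: qops is NULL when the driver cannot handle single queues. */
struct toy_dev {
    const struct queue_ops *qops;
    int (*full_reset)(void);
};

enum release_mode { RELEASE_PER_QUEUE, RELEASE_FULL_RESET, RELEASE_FAILED };

/* Per-queue mode needs all four hooks; a partial set is treated as unsupported. */
static bool has_per_queue_ops(const struct toy_dev *dev)
{
    const struct queue_ops *q = dev->qops;

    return q && q->queue_mem_alloc && q->queue_mem_free &&
           q->queue_start && q->queue_stop;
}

/* Drop and refill the dma-buf references held by rx queue qidx. */
static enum release_mode release_dmabuf_refs(const struct toy_dev *dev, int qidx)
{
    assert(dev != NULL);

    if (has_per_queue_ops(dev)) {
        /* Only the affected queue goes down; other queues keep running. */
        if (dev->qops->queue_stop(qidx) || dev->qops->queue_mem_free(qidx) ||
            dev->qops->queue_mem_alloc(qidx) || dev->qops->queue_start(qidx))
            return RELEASE_FAILED;
        return RELEASE_PER_QUEUE;
    }

    /* Fallback: a full driver reset, which affects every queue. */
    if (dev->full_reset && dev->full_reset() == 0)
        return RELEASE_FULL_RESET;
    return RELEASE_FAILED;
}

/* Stub driver used only to exercise the sketch. */
static int toy_full_resets, toy_queue_restarts;
static int toy_full_reset(void) { toy_full_resets++; return 0; }
static int toy_q_alloc(int qidx) { (void)qidx; return 0; }
static int toy_q_free(int qidx)  { (void)qidx; return 0; }
static int toy_q_stop(int qidx)  { (void)qidx; return 0; }
static int toy_q_start(int qidx) { (void)qidx; toy_queue_restarts++; return 0; }

static const struct queue_ops toy_qops = {
    .queue_mem_alloc = toy_q_alloc,
    .queue_mem_free  = toy_q_free,
    .queue_start     = toy_q_start,
    .queue_stop      = toy_q_stop,
};
```

A device with `qops` set takes the per-queue path and leaves other queues alone; a device with `qops == NULL` falls back to the full reset, which is the behaviour this series relies on today.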
That may be an incentive for vendors to support per-queue start/stop/alloc/free. Maybe the ones that support RDMA already do?
--
Thanks,
Mina