Thread (33 messages) 33 messages, 10 authors, 2021-09-02

Re: [RFC] Make use of non-dynamic dmabuf in RDMA

From: Gal Pressman <hidden>
Date: 2021-08-20 12:59:05
Also in: dri-devel, linux-media, lkml

On 20/08/2021 15:33, Jason Gunthorpe wrote:
On Fri, Aug 20, 2021 at 09:25:30AM +0200, Daniel Vetter wrote:
quoted
On Fri, Aug 20, 2021 at 1:06 AM Jason Gunthorpe [off-list ref] wrote:
quoted
On Wed, Aug 18, 2021 at 11:34:51AM +0200, Daniel Vetter wrote:
quoted
On Wed, Aug 18, 2021 at 9:45 AM Gal Pressman [off-list ref] wrote:
quoted
Hey all,

Currently, the RDMA subsystem can only work with dynamic dmabuf
attachments, which requires the RDMA device to support on-demand-paging
(ODP) which is not common on most devices (only supported by mlx5).

While the dynamic requirement makes sense for certain GPUs, some devices
(such as habanalabs) have device memory that is always "pinned" and do
not need/use the move_notify operation.

The motivation of this RFC is to use habanalabs as the dmabuf exporter,
and EFA as the importer to allow for peer2peer access through libibverbs.

This draft patch changes the dmabuf driver to differentiate between
static/dynamic attachments by looking at the move_notify op instead of
the importer_ops struct, and allowing the peer2peer flag to be enabled
in case of a static exporter.

Thanks

Signed-off-by: Gal Pressman <redacted>
Given that habanalabs dma-buf support is very firmly in limbo (at
least it's not yet in linux-next or anywhere else) I think you want to
solve that problem first before we tackle the additional issue of
making p2p work without dynamic dma-buf. Without that it just doesn't
make a lot of sense really to talk about solutions here.
I have been thinking about adding a dmabuf exporter to VFIO, for
basically the same reason habana labs wants to do it.

In that situation we'd want to see an approach similar to this as well
to have a broad usability.

The GPU drivers also want this for certain sophisticated scenarios
with RDMA, the intree drivers just haven't quite got there yet.

So, I think it is worthwhile to start thinking about this regardless
of habana labs.
Oh sure, I've been having these for a while. I think there's two options:
- some kind of soft-pin, where the contract is that we only revoke
when absolutely necessary, and it's expected to be catastrophic on the
importer's side. 
Honestly, I'm not very keen on this. We don't really have HW support
in several RDMA scenarios for even catastrophic unpin.

Gal, can EFA even do this for a MR? You basically have to resize the
rkey/lkey to zero length (or invalidate it like a FMR) under the
catstrophic revoke. The rkey/lkey cannot just be destroyed as that
opens a security problem with rkey/lkey re-use.
I had some discussions with our hardware guys about such functionality in the
past, I think it should be doable (not necessarily the same way that FMR does it).

Though it would've been nicer if we could agree on a solution that could work
for more than 1-2 RDMA devices, using the existing tools the RDMA subsystem has.
That's why I tried to approach this by denying such attachments for non-ODP
importers instead of exposing a "limited" dynamic importer.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help