Thread (48 messages) 48 messages, 9 authors, 2021-06-24

Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

From: Daniel Vetter <hidden>
Date: 2021-06-21 14:20:44
Also in: amd-gfx, dri-devel, linux-media, lkml

On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
quoted
On Fri, Jun 18, 2021 at 2:36 PM Oded Gabbay [off-list ref] wrote:
quoted
User process might want to share the device memory with another
driver/device, and to allow it to access it over PCIe (P2P).

To enable this, we utilize the dma-buf mechanism and add a dma-buf
exporter support, so the other driver can import the device memory and
access it.

The device memory is allocated using our existing allocation uAPI,
where the user will get a handle that represents the allocation.

The user will then need to call the new
uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.

The driver will return a FD that represents the DMA-BUF object that
was created to match that allocation.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Tomer Tayar <redacted>
Mission acomplished, we've gone full circle, and the totally-not-a-gpu
driver is now trying to use gpu infrastructure. And seems to have
gained vram meanwhile too. Next up is going to be synchronization
using dma_fence so you can pass buffers back&forth without stalls
among drivers.
What's wrong with other drivers using dmabufs and even dma_fence?  It's
a common problem when shuffling memory around systems, why is that
somehow only allowed for gpu drivers?

There are many users of these structures in the kernel today that are
not gpu drivers (tee, fastrpc, virtio, xen, IB, etc) as this is a common
thing that drivers want to do (throw chunks of memory around from
userspace to hardware).

I'm not trying to be a pain here, but I really do not understand why
this is a problem.  A kernel api is present, why not use it by other
in-kernel drivers?  We had the problem in the past where subsystems were
trying to create their own interfaces for the same thing, which is why
you all created the dmabuf api to help unify this.
It's the same thing as ever. 90% of an accel driver are in userspace,
that's where all the fun is, that's where the big picture review needs to
happen, and we've very conveniently bypassed all that a few years back
because it was too annoying.

Once we have the full driver stack and can start reviewing it I have no
objections to totally-not-gpus using all this stuff too. But until we can
do that this is all just causing headaches.

Ofc if you assume that userspace doesn't matter then you don't care, which
is where this giantic disconnect comes from.

Also unless we're actually doing this properly there's zero incentive for
me to review the kernel code and check whether it follows the rules
correctly, so you have excellent chances that you just break the rules.
And dma_buf/fence are tricky enough that you pretty much guaranteed to
break the rules if you're not involved in the discussions. Just now we
have a big one where everyone involved (who's been doing this for 10+
years all at least) realizes we've fucked up big time.

Anyway we've had this discussion, we're not going to move anyone here at
all, so *shrug*. I'll keep seeing accelarators in drivers/misc as blantant
bypassing of review by actual accelerator pieces, you keep seing dri-devel
as ... well I dunno, people who don't know what they're talking about
maybe. Or not relevant to your totally-not-a-gpu thing.
quoted
Also I'm wondering which is the other driver that we share buffers
with. The gaudi stuff doesn't have real struct pages as backing
storage, it only fills out the dma_addr_t. That tends to blow up with
other drivers, and the only place where this is guaranteed to work is
if you have a dynamic importer which sets the allow_peer2peer flag.
Adding maintainers from other subsystems who might want to chime in
here. So even aside of the big question as-is this is broken.
From what I can tell this driver is sending the buffers to other
instances of the same hardware, as that's what is on the other "end" of
the network connection.  No different from IB's use of RDMA, right?
There's no import afaict, but maybe I missed it. Assuming I haven't missed
it the importing necessarily has to happen by some other drivers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help