Thread: 53 messages, 11 authors, 2022-06-25

Re: [PATCH v3 00/14] Driver of Intel(R) Gaussian & Neural Accelerator

From: Daniel Vetter <hidden>
Date: 2021-05-17 20:00:30
Also in: dri-devel, lkml

On Mon, May 17, 2021 at 9:49 PM Thomas Zimmermann [off-list ref] wrote:
Hi

On 17.05.21 at 21:23, Alex Deucher wrote:
On Mon, May 17, 2021 at 3:12 PM Thomas Zimmermann [off-list ref]
wrote:
Hi

On 17.05.21 at 09:40, Daniel Vetter wrote:
On Fri, May 14, 2021 at 11:00:38AM +0200, Arnd Bergmann wrote:
On Fri, May 14, 2021 at 10:34 AM Greg Kroah-Hartman
[off-list ref] wrote:
On Thu, May 13, 2021 at 01:00:26PM +0200, Maciej Kwapulinski wrote:
Dear kernel maintainers,

This submission is a kernel driver to support Intel(R) Gaussian & Neural
Accelerator (Intel(R) GNA). Intel(R) GNA is a PCI-based neural co-processor
available on multiple Intel platforms. AI developers and users can offload
continuous inference workloads to an Intel(R) GNA device in order to free
processor resources and save power. Noise reduction and speech recognition
are the examples of the workloads Intel(R) GNA deals with while its usage
is not limited to the two.
How does this compare with the "nnpi" driver being proposed here:
          https://lore.kernel.org/r/20210513085725.45528-1-guy.zadicario@intel.com

Please work with those developers to share code and userspace api and
tools.  Having the community review two totally different apis and
drivers for the same type of functionality from the same company is
totally wasteful of our time and energy.
Agreed, but I think we should go further than this and work towards a
subsystem across companies for machine learning and neural networks
accelerators for both inferencing and training.
We have, it's called drivers/gpu. Feel free to rename to drivers/xpu or
think G as in General, not Graphics.
I hope this was a joke.

Just some thoughts:

AFAICT AI first came as an application of GPUs, but has now
evolved/specialized into something of its own. I can imagine sharing
some code among the various subsystems, say GEM/TTM internals for memory
management. Besides that there's probably little that can be shared in
the userspace interfaces. A GPU is a device that puts an image onto the
screen and an AI accelerator isn't. Treating both as the same, even if
they share similar chip architectures, seems like a stretch. They might
evolve in different directions and fit less and less under the same
umbrella.
Putting something on the screen is just a tiny part of what GPUs
do these days.  Many GPUs don't even have display hardware anymore.
Even with drawing APIs, it's just some operation that you do with
memory.  The display may be another device entirely.  GPUs also do
video encode and decode, jpeg acceleration, etc.  drivers/gpu seems
like a logical place to me.  Call it drivers/accelerators if you like.
Other than modesetting most of the shared infrastructure in
drivers/gpu is around memory management and synchronization which are
all the hard parts.  Better to try and share that than to reinvent
that in some other subsystem.
I'm not sure whether we're on the same page or not.

I look at this from the UAPI perspective: the only interfaces that we
really standardize among GPUs are modesetting, dumb buffers, GEM. The
sophisticated rendering is done with per-driver interfaces. And
modesetting is the thing that AI does not do.
Yeah, but the people who know what should be standardized and what
should not be standardized for accel drivers are here. Because we've
done both models in the past, and pretty much everything in between.

Also like Daniel said, we support hw (and know how to drive it) for
anything from "kernel bashes register values" (gpus worked like that
20 years ago) to "mostly direct userspace submit" (amdkfd and parts of
nouveau work like this).

There isn't any other subsystem with that much knowledge about how to
stand up the entire accelerator stack and not make it suck too
badly. That is the real value of dri-devel and the community we have
here, not the code sharing we occasionally tend to do.
Sharing common code among subsystems is not a problem. Many of our
more-sophisticated helpers are located in DRM because no other
subsystems have the requirements yet. Maybe AI now has and we can move
the respective shareable code to a common location. But AI is still no GPU. To
give a bad analogy: GPUs transmit audio these days. Yet we don't treat
them as sound cards.
We actually do, there are full blown sound drivers for them over in
sound/ (ok I think they're all in sound/hda for pci gpus or in
sound/soc actually). There's some glue to tie it together because it
requires coordination between the gpu and sound side of things, but
that's it.

Also I think it would be extremely silly to remove all the drm_ stuff
just because it originated from GPUs and therefore absolutely
cannot be used by other accelerators. I'm not seeing the point in
that, but if someone has a convincing technical argument for this we
could do it. A tree-wide s/drm_/xpu_/ might make some sense perhaps if
that makes people more comfortable with the idea of reusing code from
gpu origins for accelerators in general.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch