Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
From: Wang, Haiyue <hidden>
Date: 2021-06-07 07:20:22
-----Original Message-----
From: Honnappa Nagarahalli <redacted>
Sent: Sunday, June 6, 2021 09:14
To: Jerin Jacob <redacted>; Wang, Haiyue <redacted>
Cc: thomas@monjalon.net; Andrew Rybchenko <redacted>; Yigit, Ferruh [off-list ref]; dpdk-dev [off-list ref]; Elena Agostini [off-list ref]; David Marchand [off-list ref]; nd [off-list ref]; Honnappa Nagarahalli [off-list ref]; nd [off-list ref]
Subject: RE: [dpdk-dev] [PATCH] gpudev: introduce memory API

<snip>
04/06/2021 17:20, Jerin Jacob:
On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon [off-list ref] wrote:
04/06/2021 15:59, Andrew Rybchenko:
On 6/4/21 4:18 PM, Thomas Monjalon wrote:
04/06/2021 15:05, Andrew Rybchenko:
On 6/4/21 3:46 PM, Thomas Monjalon wrote:
04/06/2021 13:09, Jerin Jacob:
On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon [off-list ref] wrote:
03/06/2021 11:33, Ferruh Yigit:
On 6/3/2021 8:47 AM, Jerin Jacob wrote:
On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon [off-list ref] wrote:
+ [gpudev] (@ref rte_gpudev.h),

Since this device does not have a queue etc., shouldn't we make it a library like mempool with vendor-defined ops?

+1. The current RFC announces additional memory allocation capabilities, which would suit better as an extension to an existing memory-related library than as a new device abstraction library.
It is not replacing mempool. It is more at the same level as EAL memory management: allocate a simple buffer, with the exception that it is done on a specific device, so it requires a device ID. The other reason it needs to be a full library is that it will start a workload on the GPU and get a completion notification, so we can integrate the GPU workload in a packet processing pipeline.
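The completion notification mentioned above can be pictured as a flag that the device sets and a CPU pipeline stage polls. The following is only a minimal sketch under stated assumptions: every `sketch_` name is hypothetical, the "device" is a plain host function, and none of this is the API proposed in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion flag shared between the CPU and one device. */
struct sketch_comm_flag {
    int16_t dev_id;   /* device that runs the workload (assumed ID type) */
    atomic_int done;  /* 0 = workload pending, 1 = workload complete */
};

/* Prepare a flag before handing the workload to the device. */
void
sketch_flag_init(struct sketch_comm_flag *f, int16_t dev_id)
{
    assert(f != NULL);
    f->dev_id = dev_id;
    atomic_init(&f->done, 0);
}

/* Stand-in for the device side: publish completion with release ordering
 * so that results written before this store are visible to the poller. */
void
sketch_device_complete(struct sketch_comm_flag *f)
{
    assert(f != NULL);
    atomic_store_explicit(&f->done, 1, memory_order_release);
}

/* CPU side: non-blocking check that a pipeline stage can call per burst. */
bool
sketch_flag_is_done(struct sketch_comm_flag *f)
{
    assert(f != NULL);
    return atomic_load_explicit(&f->done, memory_order_acquire) == 1;
}
```

A pipeline stage would call `sketch_flag_is_done()` between packet bursts instead of blocking, which is what lets the device workload sit inside a polling loop.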
I might have confused you. My intention is not to make it fit under the mempool API.
I agree that we need a separate library for this. My objection is only to the name: call it libgpu rather than libgpudev, and have APIs with rte_gpu_ instead of rte_gpu_dev, as it is not like the existing "device libraries" in DPDK and is like the other "libraries" in DPDK.

I think we should define a queue of processing actions, so it looks like other device libraries. And anyway I think a library managing a device class, and having some device drivers, deserves the name of device library. I would like to read more opinions.

Since the library is a unified interface to GPU device drivers, I think it should be named as in the patch: gpudev. Mempool looks like an exception here: initially it was a pure SW library, but now there are HW backends and corresponding device drivers. What I don't understand is where the GPU specifics are here.

That's an interesting question. Let's ask first what is a GPU for DPDK? I think it is like a sub-CPU with high parallel execution capabilities, and it is controlled by the CPU.

I have no good ideas how to name it in accordance with the above description to avoid the "G", which stands for "Graphics" if I understand correctly. However, maybe it is not required. No strong opinion on the topic, but unbinding from "Graphics" would be nice.

That's a question I have been asking myself for months now. I am not able to find a better name, and I am starting to think that "GPU" is famous enough in high-load computing to convey the idea of what we can expect.

The closest I can think of is big-little architecture in ARM SoC. https://www.arm.com/why-arm/technologies/big-little

From the application pov, big-little arch is nothing but SMT. Not sure how it is similar to another device on PCIe.
We do have a similar architecture, where the "coprocessor" is part of the main CPU. Its operations are:
- Download firmware
- Memory mapping for main CPU memory by the co-processor
- Enq/Deq jobs from/to main CPU/coprocessor CPU.

Yes, it looks like the exact same scope. I like the word "co-processor" in this context.
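The enqueue/dequeue part of that list can be sketched as a small job ring between the main CPU (producer) and the co-processor (consumer). This is an illustration only: all `sketch_` names and the job layout are assumptions, the ring is single-threaded, and firmware download and memory mapping are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* power of two, so index wrap is a simple mask */

/* Hypothetical job descriptor handed from the main CPU to the co-processor. */
struct sketch_job {
    uint32_t opcode; /* what the co-processor should do */
    void *buf;       /* buffer the job operates on */
    size_t len;      /* buffer length in bytes */
};

struct sketch_job_ring {
    struct sketch_job slots[SKETCH_RING_SIZE];
    uint32_t head; /* total jobs enqueued so far (producer index) */
    uint32_t tail; /* total jobs dequeued so far (consumer index) */
};

void
sketch_ring_init(struct sketch_job_ring *r)
{
    assert(r != NULL);
    r->head = 0;
    r->tail = 0;
}

/* Main CPU side: returns false when the ring is full. */
bool
sketch_job_enqueue(struct sketch_job_ring *r, const struct sketch_job *job)
{
    assert(r != NULL && job != NULL);
    if (r->head - r->tail == SKETCH_RING_SIZE)
        return false;
    r->slots[r->head & (SKETCH_RING_SIZE - 1u)] = *job;
    r->head++;
    return true;
}

/* Co-processor side: returns false when there is nothing to process. */
bool
sketch_job_dequeue(struct sketch_job_ring *r, struct sketch_job *out)
{
    assert(r != NULL && out != NULL);
    if (r->head == r->tail)
        return false;
    *out = r->slots[r->tail & (SKETCH_RING_SIZE - 1u)];
    r->tail++;
    return true;
}
```

A real implementation would need proper synchronization between the two sides (or a hardware doorbell); the sketch only shows the shape of the interface.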
If your scope is something similar and no graphics are involved here, then we can remove the G.

Indeed, no graphics in DPDK :) By removing the G, you mean keeping only PU? Like "pudev"? We could also define the G as "General".
Coincidentally, yesterday I had an interaction with Elena about the same for baseband-related work in ORAN, where the GPU is used for baseband processing instead of graphics. (So I can understand the big picture of this library.)

This patch does not provide the big picture view of what the processing looks like using GPU. It would be good to explain that. For ex:
1) Will the notion of GPU be hidden from the application? i.e. is the application allowed to launch kernels?
1a) Will DPDK provide abstract APIs to launch kernels? This would require us to have the notion of GPU in DPDK, and the application would depend on the availability of GPU in the system.
2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU. This does not require us to have a notion of GPU in DPDK at the API level.

If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own local memory. Maybe some of the APIs could use generic names. For ex: instead of calling it "rte_gpu_malloc", maybe we could call it "rte_dev_malloc". This way any future device which hosts its own memory that needs to be managed by the application can use these APIs.
"rte_dev_malloc" sounds a good name, then looks like we need to enhance the 'struct rte_device' with some new ops as: eal: move DMA mapping from bus-specific to generic driver https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759-1-thomas@monjalon.net/
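As a rough illustration of the "generic device with memory ops" idea, here is a minimal sketch. The struct layout and every `sketch_` name are hypothetical: this is neither the real `struct rte_device` nor the content of the linked patch, and the toy driver simply falls back to host `malloc()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_device;

/* Hypothetical per-driver memory ops attached to a generic device. */
struct sketch_dev_mem_ops {
    void *(*dev_malloc)(struct sketch_device *dev, size_t size);
    void (*dev_free)(struct sketch_device *dev, void *ptr);
};

/* Hypothetical generic device; mem_ops is NULL when the device has no
 * local memory that the application can manage. */
struct sketch_device {
    const char *name;
    const struct sketch_dev_mem_ops *mem_ops;
    unsigned int live_allocs; /* bookkeeping for the toy driver below */
};

/* Generic entry point: dispatch to the driver, or fail cleanly. */
void *
sketch_dev_malloc(struct sketch_device *dev, size_t size)
{
    if (dev == NULL || dev->mem_ops == NULL ||
        dev->mem_ops->dev_malloc == NULL)
        return NULL;
    return dev->mem_ops->dev_malloc(dev, size);
}

void
sketch_dev_free(struct sketch_device *dev, void *ptr)
{
    if (dev == NULL || ptr == NULL || dev->mem_ops == NULL ||
        dev->mem_ops->dev_free == NULL)
        return;
    dev->mem_ops->dev_free(dev, ptr);
}

/* Toy driver: pretends host memory is device memory. */
static void *
host_backed_malloc(struct sketch_device *dev, size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        dev->live_allocs++;
    return p;
}

static void
host_backed_free(struct sketch_device *dev, void *ptr)
{
    assert(dev->live_allocs > 0);
    free(ptr);
    dev->live_allocs--;
}

const struct sketch_dev_mem_ops sketch_host_backed_ops = {
    .dev_malloc = host_backed_malloc,
    .dev_free = host_backed_free,
};
```

With this shape the application only sees the generic calls, and any device class that hosts its own memory can plug in its own ops without a GPU-specific name in the API.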
Yes, baseband processing is one possible usage of GPU with DPDK. We could also imagine some security analysis, or any machine learning...

I can think of "coprocessor-dev" as one of the names.

"coprocessor" looks too long as a prefix for the functions.

Yes. The library name can be lengthy, but the API prefix should be a short form of about 3 letters.
We do have similar machine learning co-processors (for compute). If we can keep a generic name and it is for the above functions, we may use this subsystem as well in the future.

Accelerator, 'acce_dev'? ;-)

It may get confused with HW accelerators. Some of the options I can think of, sorted by my preference (library name, API prefix):
1) libhpc-dev, rte_hpc_ (hpc -> heterogeneous processor compute)
2) libhc-dev, rte_hc_ (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example hardware)
3) libpu-dev, rte_pu_ (pu -> processing unit)
4) libhp-dev, rte_hp_ (hp -> heterogeneous processor)
5) libcoprocessor-dev, rte_cps_ ?
6) libcompute-dev, rte_cpt_ ?
7) libgpu-dev, rte_gpu_

These seem to assume that the application can launch its own workload on the device? Does DPDK need to provide abstract APIs for launching work on a device?
Yes, that's the idea: to share a common synchronization mechanism with different HW. It's cool to see such a big interest in the community for this patch.