Thread: 76 messages, 11 authors, 2025-04-22

Re: [RFC PATCH 00/13] Ultra Ethernet driver introduction

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2025-03-26 14:45:41
Also in: linux-rdma

On Tue, Mar 25, 2025 at 05:02:37PM +0000, Sean Hefty wrote:
> > > I view a job as scoped by a network address, versus a system global object.
> > > So, I was looking at a per device scope, though I guess a per port
> > > (similar to a pkey) is also possible.  My reasoning was that a device
> > > _may_ need to allocate some per job resource.  Per device job objects
> > > could be configured to have the same 'job address', for an indirect association.
> > If I understand UEC's job semantics correctly, then the local scope of a job may
> > span multiple local ports from multiple local devices.
> > It would of course translate into device specific reservations.
> Agreed.  I should have said job id/address has a network address
> scope.  For example, job 3 at 10.0.0.1 _may_ be a different logical
> job than job 3 at 10.0.0.2.  Or they could also belong to the same
> logical job.  Or the same logical job may use different job id
> values for different network addresses.
>
> A device-centric model is more aligned with the RDMA stack.  IMO,
> higher-level SW would then be responsible for configuring and
> managing the logical job.  For example, maybe it needs to assign and
> configure non-RDMA resources as well.  For that reason, I would push
> the logical job management outside the kernel subsystem.

Like I said already, I think Job needs to be a first class RDMA object
that is used by all transports that have job semantics.

I expect variation here. UEC made its choices for how the job headers
are stacked on the wire, and I foresee that other protocols will make
different choices.

Jobs may have other data like addresses and encryption keys to define
what packets are part of the job on the network.

So the specific scope of the job may change based on the protocol.

The act of creating a job is really creating a global security object
with some protocol specific properties and must come with a sane
security model to both restrict creation and restrict consuming the
job security object. I favour FD passing for the latter and file
system ACLs for the former.
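
As a rough userspace illustration of that split (a sketch only, not an
RDMA or UEC interface): below, a plain file stands in for the job
security object. Creation is gated by ordinary file system permissions
on the path, and a consumer gains access only by receiving the open FD
over a Unix socket (SCM_RIGHTS). The names create_job_object,
pass_job_fd, receive_job_fd and the "job-3" path are hypothetical and
exist only for this example.

```python
import os
import socket
import tempfile


def create_job_object(path, mode=0o600):
    """Create the stand-in job object; fails if it already exists.

    Who may create it is decided by the permissions/ACLs on the parent
    directory, which is the "file system ACLs" half of the model.
    """
    return os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, mode)


def pass_job_fd(sock, fd):
    """Hand the open FD to the peer (SCM_RIGHTS under the hood)."""
    socket.send_fds(sock, [b"job"], [fd])


def receive_job_fd(sock):
    """Receive one FD from the peer; the caller owns the returned FD."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    return fds[0]


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "job-3")  # hypothetical job name
        owner_fd = create_job_object(path)
        os.write(owner_fd, b"job-key")

        # A socketpair stands in for the channel between the job
        # launcher and the process that is allowed to consume the job.
        a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        with a, b:
            pass_job_fd(a, owner_fd)
            consumer_fd = receive_job_fd(b)

        # The consumer never opened the path itself; it only holds the FD.
        os.lseek(consumer_fd, 0, os.SEEK_SET)
        data = os.read(consumer_fd, 16)
        os.close(owner_fd)
        os.close(consumer_fd)
        return data
```

Calling demo() returns the bytes the creator wrote, read back through
the passed FD. This needs Python 3.9+ on a Unix system for
socket.send_fds() and socket.recv_fds().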

Jason