Thread (40 messages), 7 authors, 2024-02-20

Re: [RFC PATCH v3 net-next] Documentation: devlink: Add devlink-sd

From: Jiri Pirko <jiri@resnulli.us>
Date: 2024-02-16 08:15:36

Fri, Feb 16, 2024 at 03:07:29AM CET, kuba@kernel.org wrote:
On Thu, 15 Feb 2024 09:41:31 -0800 Jacob Keller wrote:
I don't know offhand if we have a device which can share pools
specifically, but we do have multi-PF devices which have a lot of shared
resources. However, due to the multi-PF PCIe design, I looked into ways
to get a single devlink across the devices, but ultimately got stymied
and gave up.

This left us with accepting the limitation that each PF gets its own
devlink and can't really communicate with other PFs.

The existing solution has just been to partition the shared resources
evenly across PFs, typically via firmware. No flexibility.

I do think the best solution here would be to figure out a generic way
to tie multiple functions into a single devlink representing the device.
Then each function gets the set of devlink_port objects associated to
it. I'm not entirely sure how that would work. We could hack something
together with auxbus, but that's pretty ugly. Some sort of orchestration
in the PCI layer that could identify when a device wants to have some
sort of "parent" driver which loads once and has ties to each of the
function drivers would be ideal.

Then this parent driver could register devlink, and each function driver
could connect to it and allocate ports and function-specific resources.

Alternatively a design which loads a single driver that maintains
references to each function could work but that requires a significant
change to the entire driver design and is unlikely to be done for
existing drivers...
I think the complexity mostly stems from having to answer what the
"right behavior" is. At least that's what I concluded when thinking
about it back at Netronome :)  If you do a strict hierarchy where
one PF is preassigned the role of the leader, and just fail if anything
unexpected happens - it should be doable. We already kinda have the
model where devlink is the "first layer of probing" and "reload_up()"
is the second.

Have you had a chance to take a closer look at mlx5 "socket direct"
(rename pending) implementation?

BTW Jiri, weren't you expecting that to use component drivers or some
such?
IIRC, my colleagues found that it was not suitable for this case.
You'd have to ask them why; I don't recall.

But socket direct is a different kind of story. There, 2/n PFs are just
separate NUMA PCI channels to a single FW entity.