Thread: 44 messages, 6 authors, 2019-03-21

Re: [PATCH net-next 4/8] devlink: allow subports on devlink PCI ports

From: Jiri Pirko <jiri@resnulli.us>
Date: 2019-02-28 09:06:24

Wed, Feb 27, 2019 at 07:30:00PM CET, jakub.kicinski@netronome.com wrote:
On Wed, 27 Feb 2019 13:37:53 +0100, Jiri Pirko wrote:
Tue, Feb 26, 2019 at 07:24:32PM CET, jakub.kicinski@netronome.com wrote:
PCI endpoint corresponds to a PCI device, but such device
can have one or more logical device ports associated with it.
We need a way to distinguish those. Add a PCI subport in the
dumps and print the info in phys_port_name appropriately.

This is not equivalent to port splitting, there is no split
group. It's just a way of representing multiple netdevs on
a single PCI function.

Note that the quality of being multiport pertains only to
the PCI function itself. A PF having multiple netdevs does
not mean that its VFs will also have multiple, or that VFs
are associated with any particular port of a multiport PF.
We've been discussing the problem of subport (we call it "subfunction"
or "SF") for some time internally. It turned out that this is probably
a harder task to model. Please prove me wrong.

The nature of VF makes it a logically separate entity. It has a separate
PCI address, it should therefore have a separate devlink instance.
You can pass it through to a VM; then the same devlink instance should be
created inside the VM and disappear from the host.
Depends what a devlink instance represents :/  On one hand you may want
to create an instance for a VF to allow it to spawn soft ports, on the
other you may want to group multiple functions together.

IOW if devlink instance is for an ASIC, there should be one per device
per host.  So if we start connecting multiple functions (PFs and/or VFs)
to one host we should probably introduce the notion of devlink aliases
or some such (so that multiple bus addresses can target the same
Hmm. Like VF address -> PF address alias? That would be confusing to see
eswitch ports under VF devlink instance... I probably did not get you
right.

devlink instance).  Those less pipelined NICs can forward between
ports, but still want a function per port (otherwise user space
sometimes gets confused).  If we have multiple functions which are on
the same "switchid" they should have a single devlink instance if you
ask me.  That instance will have all the ports of the device.
Okay, that makes sense. But the question is, can the same devlink
instance contain ports that do not have a "switchid"?

I think it would be beneficial to have the switchid shown for devlink
ports too. Then it is clear that the devlink ports with the same
switchid belong to the same switch, and other ports under the same
devlink instance (like the PF itself) are separate, but still under the same
ASIC.
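
To illustrate the grouping I have in mind, here is a rough sketch. It is
not devlink code: the Port class, the port names and the "switch-a" id
are all made up for the example. Ports with the same switchid form one
switch; a port without a switchid (like the PF itself) stays under the
same devlink instance but belongs to no switch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for a devlink port (not a kernel structure)."""
    name: str
    switch_id: Optional[str]  # None means the port has no switchid


def group_by_switch_id(ports):
    """Map each switchid (or None) to the names of the ports that carry it."""
    groups = defaultdict(list)
    for port in ports:
        groups[port.switch_id].append(port.name)
    return dict(groups)


# All ports below live under one devlink instance (one ASIC).
# Names and ids are invented for illustration only.
ports = [
    Port("pf0", None),            # the PF itself, no switchid
    Port("pf0vf0", "switch-a"),   # eswitch port for VF 0
    Port("pf0vf1", "switch-a"),   # eswitch port for VF 1
]

groups = group_by_switch_id(ports)
for switch_id, names in groups.items():
    label = switch_id if switch_id is not None else "(no switchid)"
    print(label, names)
```

With the switchid visible per port, user space could do this grouping
itself instead of guessing from the port names.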

You say disappear from the host - what do you mean?  Are you referring
to the VF port disappearing?  But on the switch the port is still
No, the VF itself. The eswitch port will still be there on the host.

there, and you should show the subports on the PF side IMHO.  Devlink
ports should allow users to understand the topology of the switch.
What do you mean by "topology"?

Is spawning VMDq sub-instances the only thing we can think of that VMs
may want to do?  Are there any other uses?
SF (or subport) feels similar to that. Basically it is exactly the same
thing as a VF, only it resides under the PF PCI function.

That is why I think, for the sake of consistency, it should have a separate
devlink entity as well. The problem is correct sysfs modelling and the
devlink handle derived from that. Parav is working on a simple soft
bus for this purpose called "subbus". There is an RFC floating around on
the Mellanox internal mailing list; looks like it is time to send it
upstream.

Then each PF driver which has SFs would register subbus devices
according to its SFs/subports, and they would be properly handled by bus
probe, with devlink, devlink port and netdev instances created.

Ccing Parav and Jason.
You guys come from the RDMA side of the world, with which I'm less
familiar, and the soft bus + spawning devices seems to be a popular
design there.  Could you describe the advantages of that model for 
the sake of the netdev-only folks? :)
I'll try to draw some ascii art :)
Another term that gets thrown into the mix here is mediated devices,
right?  If you wanna pass the sub-spawn-soft-port to a VM.  Or run 
DPDK on some queues.

To state the obvious, AF_XDP and macvlan offload were our previous
answers to some of those use cases.  What is the forwarding model
for those subports?  Are we going to allow flower rules from VMs?
Is it going to be dst MAC only?  Or is the hypervisor going to forward
as it sees appropriate (OvS + "repr"/port netdev)?