Re: [PATCH net-next 6/8] devlink: introduce port's peer netdevs
From: Jiri Pirko <jiri@resnulli.us>
Date: 2019-03-01 07:46:58
Thu, Feb 28, 2019 at 05:36:44PM CET, jakub.kicinski@netronome.com wrote:
On Thu, 28 Feb 2019 10:00:54 +0100, Jiri Pirko wrote:
Wed, Feb 27, 2019 at 07:47:42PM CET, jakub.kicinski@netronome.com wrote:
On Wed, 27 Feb 2019 14:08:29 +0100, Jiri Pirko wrote:
Tue, Feb 26, 2019 at 07:24:34PM CET, jakub.kicinski@netronome.com wrote:
Devlink ports represent ports of a switch device (or SR-IOV NIC which has an embedded switch). In case of SR-IOV when PCIe PFs are exposed the PFs which are directly connected to the local machine may also spawn PF netdev (much like VFs have a port/"repr" and an actual VF netdev). Allow devlink to expose such linking. There is currently no way to find out which netdev corresponds to which PF.

Example:

  $ devlink port
  pci/0000:82:00.0/0: type eth netdev p4p1 flavour physical
  pci/0000:82:00.0/10000: type eth netdev eth1 flavour pci_pf pf 0 peer_netdev enp130s0
  pci/0000:82:00.0/10001: type eth netdev eth0 flavour pci_vf pf 0 vf 0
  pci/0000:82:00.0/10002: type eth netdev eth2 flavour pci_vf pf 0 vf 1

Peer as the other side of a "virtual cable". For PF, that is probably sufficient. But I think a "peer of devlink port" should be "a devlink port".

Maybe I'm not clear on what devlink port is - to me it's a port of the ASIC. The notion of devlink port connected to devlink port seems to counter such definition :S

"port of the ASIC" in a sense of "eswitch ports"?

Yes.
I do not think that every netdev should have a devlink port associated.
Not sure about VF. Consider a simple problem of setting up a VF mac address. In legacy, you do it like this:

  $ ip link set eth2 vf 1 mac 00:52:44:11:22:33

However, in new model, you so far cannot do that.

Why?

  $ devlink port set pci/0000:82:00.0/10001 peer_eth_addr 00:52:44:11:22:33

Yeah. That is not yet implemented. I agree it is most straightforward. The question is, is it fine to have set of:

  peer_eth_addr
  peer_mtu
  peer_something_else

Or rather to have some object to pin this on. Something like:

  $ devlink port peer set pci/0000:82:00.0/10001 eth_addr 00:52:44:11:22:33

I do like the object one better, would this mean I should restructure the peer stuff somehow (netlink attribute structure)?
Well we can introduce separate commands:

  DEVLINK_CMD_PORT_PEER_GET
  DEVLINK_CMD_PORT_PEER_SET

For "set" part, this would work nice. However for the "get" part, we would have to call both DEVLINK_CMD_PORT_GET and DEVLINK_CMD_PORT_PEER_GET. So probably better to add a nest attr:

  DEVLINK_ATTR_PORT_PEER

and have attrs like:

  DEVLINK_ATTR_PORT_PEER_HW_ADDR (does not have to be always eth, right?)
  DEVLINK_ATTR_PORT_PEER_TYPE (DEVLINK_PORT_TYPE_NOTSET/DEVLINK_PORT_TYPE_ETH/DEVLINK_PORT_TYPE_IB)
  DEVLINK_ATTR_PORT_PEER_NETDEV_IFINDEX
  DEVLINK_ATTR_PORT_PEER_NETDEV_NAME
  DEVLINK_ATTR_PORT_PEER_NETDEV_IBDEV_NAME

in the nest. The userspace part can stay as I described previously:

  $ devlink port peer set pci/0000:82:00.0/10001 hw_addr 00:52:44:11:22:33

Not sure about "port show" output. In json, the "peer" things should be under "peer" dictionary.
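To make the "peer" dictionary idea concrete, here is a rough Python sketch of how a port entry might be rendered with the peer attributes grouped under one key. Everything in it is an assumption for illustration: the DEVLINK_ATTR_PORT_PEER_* names are only the ones proposed above and are not claimed to exist in the devlink uAPI, and the JSON key names (hw_addr, type, ifindex, netdev, ibdev), the overall layout, and the helper port_to_json() are hypothetical.

```python
import json

# Hypothetical mapping from the attribute names proposed in this thread to
# JSON keys. Neither side is claimed to exist in the real devlink uAPI.
PEER_ATTRS = {
    "DEVLINK_ATTR_PORT_PEER_HW_ADDR": "hw_addr",
    "DEVLINK_ATTR_PORT_PEER_TYPE": "type",
    "DEVLINK_ATTR_PORT_PEER_NETDEV_IFINDEX": "ifindex",
    "DEVLINK_ATTR_PORT_PEER_NETDEV_NAME": "netdev",
    "DEVLINK_ATTR_PORT_PEER_NETDEV_IBDEV_NAME": "ibdev",
}


def port_to_json(handle, port_attrs, peer_nest):
    """Build a dict for one port, with peer attributes under a "peer" key.

    port_attrs holds the port's own (already decoded) attributes;
    peer_nest holds the contents of the proposed DEVLINK_ATTR_PORT_PEER nest.
    """
    entry = dict(port_attrs)
    # Keep only attributes that belong to the proposed nest.
    peer = {PEER_ATTRS[k]: v for k, v in peer_nest.items() if k in PEER_ATTRS}
    if peer:  # ports without a peer get no "peer" key at all
        entry["peer"] = peer
    return {"port": {handle: entry}}


if __name__ == "__main__":
    # Values taken from the PF line of the example output in the patch description.
    doc = port_to_json(
        "pci/0000:82:00.0/10000",
        {"type": "eth", "netdev": "eth1", "flavour": "pci_pf", "pf": 0},
        {"DEVLINK_ATTR_PORT_PEER_NETDEV_NAME": "enp130s0"},
    )
    print(json.dumps(doc, indent=2))
```

With a single nest, one DEVLINK_CMD_PORT_GET reply would carry everything needed to fill both levels of such a dictionary, which is the point of preferring the nest over a separate "get" command.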
The MTU stuff is tricky, perhaps best left for its own series ;)
It's more of a neighbour info situation than a local port situation.
What I was thinking about was some "dummy peer" which would be on the host. Not sure if only as a "dummy peer devlink port" or even as some sort of "dummy netdev". One way or another, it would provide the user some info about which VF representor is connected to which VF in VM (mac mapping).

Ack, but isn't the MAC setting the only thing we're missing from "switchdev SR-IOV"? Would the "dummy netdev" be used for anything else? I would rather not introduce new netdev just to do that

Agreed. It was just a wild idea :)

:)
(that'd be a third for that port.)