Thread: 75 messages, 10 authors, 2018-01-30

Re: [PATCH v4 0/5] lib: add Port Representors

From: Thomas Monjalon <hidden>
Date: 2018-01-15 16:09:59

15/01/2018 13:12, Doherty, Declan:
On 10/01/2018 7:26 PM, Thomas Monjalon wrote:
quoted
10/01/2018 14:46, Doherty, Declan:
quoted
On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
quoted
Hi,

08/01/2018 15:37, Remy Horton:
quoted
Port Representors provide a logical presentation in DPDK of VF (virtual
function) ports for the purposes of control and monitoring. Each port
representor device represents a single VF and is associated with its
parent physical function (PF) PMD which provides the back-end hooks for
the representor device ops and defines the control domain to which that
port belongs. This allows existing DPDK APIs to be used to monitor and
control the port without the need to create and maintain VF specific APIs.
Extending control plane ability of DPDK has been discussed
multiple times.
It has, and I have yet to see a really strong reason as to why we would
not support control plane functions within DPDK, many of which are
already supported today implicitly anyway through our ethdev APIs.
quoted
The current agreed policy is:
"
The primary goal of DPDK is to provide a userspace dataplane.
Managing VFs from a PF driver is a control plane feature and developers
should generally rely on the Linux Kernel for that.
"
http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
My understanding is that this particular entry was based around the
discussion on the divergence of functionality between the Linux kernel
PF driver and the DPDK PF driver. I also don't really think the above
statement is valid as a blanket statement for the project as it makes
the assumption that DPDK is only deployed on Linux hosts, what about
FreeBSD? or in the future Windows?
Yes, we must agree on removing this scope limitation while working
on a generic VF representor.
quoted
A number of presentations at both Userspace in Dublin and the Summit
in San Jose discussed the support of control plane functionality by
DPDK and there weren't any strong arguments or opposition against using
DPDK for control plane functions that I saw.

In any case this patchset is not introducing any new control plane APIs
that don't already exist within DPDK today, it only enables the creation
of a new type of virtual PMDs which are linked to the same base
infrastructure and which can be used to represent VFs in a control plane
application as we have implemented in this patch set.
quoted
If we relax this policy, I think the representor solution should be
a real port, not only "for the purposes of control and monitoring".
It has been asked several times as replies to this series,
but it is kindly ignored, saying it will be thought later.
I think we have stated in multiple discussions, especially during the
userspace presentation back in September that this solution supports
data path on the representors PMDs, and we have used the
infrastructure proposed here to do exactly what you are asking. As the
representor infrastructure doesn't preclude the support of a data
path, we have used it as it is presented here to implement a data path
for exception path packets for a prototype vswitch offload implementation.

quoted
I don't see a general agreement on this series so far.
I think the main issue of contention is that there is a
misunderstanding that this implementation only supports control plane
management and monitoring, but that is not the case and it can be used
for full data path representors, with limited or no control plane
functionality if required. At the end of the day the only limitations
are based on what is implemented by the backend base driver where the
broker is running for the representor ports.
The misunderstanding may originate from what you describe (even in v5):
"ports for the purposes of control and monitoring"
Noted, but that is the scope of what we demonstrate in the patchset, but 
we'll update the introduction to reflect the fact that they can be used 
to also support data path functions, such as exception path traffic for 
hw switch.
quoted
I think everybody agrees to have VF representors in DPDK.
But there are few things which are not generic enough,
and not easy to use.
I hoped the discussion started at Dublin would continue
on the mailing list but I realize the joint effort with other vendors
did not happen.
I will elaborate quickly below and in more detail in a later review.

1/ In order to keep track of the relations between PF, VF and
representor - which are all ethdev - you create a struct outside
of ethdev. I think it should be managed inside ethdev API.
Initially we had implemented the representor functionality within the 
context of the ethdev library but ran into a number of scenarios where 
this didn't work well as it makes the assumption that the base device 
that the representors are attached to is always an ethdev, we ran into 
cases where the PF isn't necessarily an ethdev, for example in some 
smartNICs the PF would be better represented by a switchdev, or it is 
possible that the device hosting the representor broker could just 
provide a conduit to a kernel driver.
The base device may be something else than an ethdev PF,
so it may be represented as a rte_device.
But the VF and representor are still ethdev.
I still think the relationship should be described in ethdev.
quoted
As suggested by others, we could also think whether a switchdev API
is interesting or not.
Indeed if a switchdev is something that is required by the community it 
would make sense that the representor infrastructure was initialized 
within the switchdev and not an ethdev. The advantage of keeping the 
representor infrastructure independent is that it gives the flexibility 
for representors to be supported independently of device type they are 
attached to.
switchdev is just another device class.
We must abstract the base device as a basic EAL rte_device for now.
quoted
2/ You create a new library for ethdev device management.
It is the responsibility of the bus to scan and probe devices.
In Intel case, the representor has no real bus so it must rely on
the vdev bus. Note that the new custom scan hook may help.
This isn't the case in latest versions of the patchset, the bus the 
representors are dependent on is that of the base device, so for the 
i40e it's the PF PCI device.
I'm suggesting to use vdev as bus of i40e representor,
because there is no PCI identifier for them, right?
quoted
In Mellanox case, the representor already exists in Linux, and is based
on PCI bus.
Then, as any other port, it must be managed with whitelist or blacklist.
I think the suggestion by Yuanhan of using the device whitelist command 
option makes sense as an option from the commandline, but it would 
require the newly proposed implementation which allows specification of 
both the bus and device as not all devices are PCI, which have multiple 
host ports using SR-IOV, but there are cases when a dynamic 
creation/destruction of ports may also need to be supported, which is 
what the representor APIs support.
The full proposed syntax of device identification is:
	http://dpdk.org/ml/archives/dev/2017-December/084572.html
With this generic syntax, we can describe port representors and request
their initialization via whitelisting.

[...]
quoted
3/ You are using PCI address + index to identify the representor.
It is a no-go. We have made effort to abstract buses.
As an idea, the identification of a representor could use the new
proposed flexible device syntax.
We are currently using net_representor_%bus%_%device_id%_%vport_id% to 
identify each representor device but I have no issue changing to either 
the current convention which would be net_representor_%unique_id% or if 
I understand the proposal in the RFC "ether: standardize getting the 
port by name", we should be looking at something along the lines of 
net_%bus%_%device_id%_%port_id% which is pretty close to what we are 
using now.
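
For illustration only, the net_representor_%bus%_%device_id%_%vport_id%
convention described above could be produced by a helper along these
lines. This is a minimal sketch and not code from the patchset: the
function name representor_name_format, its parameters and the example
bus/device strings are all hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patchset or of DPDK):
 * builds "net_representor_<bus>_<device_id>_<vport_id>" into buf.
 * Returns 0 on success, -1 if the name does not fit in len bytes.
 */
static int
representor_name_format(char *buf, size_t len, const char *bus,
			const char *device_id, unsigned int vport_id)
{
	int n = snprintf(buf, len, "net_representor_%s_%s_%u",
			 bus, device_id, vport_id);

	/* snprintf returns the length it wanted to write; reject truncation. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Taking the bus and the device id as opaque strings keeps the helper
independent of any PCI-specific parsing.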
It is introducing yet another syntax.
It would be better to align every device identification usages
on a common and standard syntax.
Then the syntax parsing will be done in bus and libs (e.g. ethdev)
with some helper functions which can be re-used for every usages.
The latest version of representors is using direct PCI parsing instead
of relying on some bus or ethdev helpers.
In terms of that RFC I'm not clear on if the proposal is just to affect 
the API for getting a port by name, or actually the name assigned 
to the device itself.
It is not about the name. It is a proposal of syntax to describe a
resource of a hardware device, or a virtual device.
The most obvious usage is to replace -w/-b and --vdev.
It can also be used in OVS or any other DPDK configuration.
quoted
4/ Such new API must be experimental.
We will address this in the next revision
quoted
I propose to better think the representor API inside ethdev
with a good multi-vendor collaboration,
and submit a deprecation notice for 18.05 integration.
I would really like to see this included as experimental in 18.02 
release. If it is agreed by the community that we need to re-integrate 
the representor concept into librte_ethdev for 18.05 we will 
support that work.
I don't see the benefit of introducing a lib and a syntax which would
be replaced in the next release.