Thread (49 messages) 49 messages, 8 authors, 2021-03-05

Re: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

From: Dumitrescu, Cristian <hidden>
Date: 2021-03-05 18:45:00

-----Original Message-----
From: Matan Azrad <redacted>
Sent: Thursday, March 4, 2021 6:34 AM
To: Dumitrescu, Cristian <redacted>; Li Zhang
[off-list ref]; Dekel Peled [off-list ref]; Ori Kam
[off-list ref]; Slava Ovsiienko [off-list ref]
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
[off-list ref]; Raslan Darawsheh [off-list ref];
mb@smartsharesystems.com; ajit.khaparde@broadcom.com; Yigit, Ferruh
[off-list ref]; Singh, Jasvinder [off-list ref]
Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

Hi Cristian

From: Dumitrescu, Cristian
quoted
Hi Matan,
quoted
-----Original Message-----
From: Matan Azrad <redacted>
Sent: Tuesday, March 2, 2021 6:10 PM
To: Dumitrescu, Cristian <redacted>; Li Zhang
[off-list ref]; Dekel Peled [off-list ref]; Ori Kam
[off-list ref]; Slava Ovsiienko [off-list ref]
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
[off-list ref];
quoted
quoted
Raslan Darawsheh [off-list ref]; mb@smartsharesystems.com;
ajit.khaparde@broadcom.com; Yigit, Ferruh [off-list ref];
Singh, Jasvinder [off-list ref]
Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

Hi Cristian

Good discussion, thank you for that!

From: Dumitrescu, Cristian
quoted
Hi Matan,
quoted
-----Original Message-----
From: Matan Azrad <redacted>
Sent: Tuesday, March 2, 2021 12:37 PM
To: Dumitrescu, Cristian <redacted>; Li Zhang
[off-list ref]; Dekel Peled [off-list ref]; Ori Kam
[off-list ref]; Slava Ovsiienko [off-list ref]
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
[off-list ref];
quoted
quoted
Raslan Darawsheh [off-list ref];
mb@smartsharesystems.com;
quoted
quoted
quoted
quoted
ajit.khaparde@broadcom.com; Yigit, Ferruh
[off-list ref]; Singh, Jasvinder
[off-list ref]
Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

HI Cristian

From: Dumitrescu, Cristian
quoted
Hi Matan,
quoted
-----Original Message-----
From: Matan Azrad <redacted>
Sent: Tuesday, March 2, 2021 7:02 AM
To: Dumitrescu, Cristian <redacted>; Li
Zhang [off-list ref]; Dekel Peled [off-list ref]; Ori
Kam [off-list ref]; Slava Ovsiienko
[off-list ref]
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
[off-list ref];
quoted
quoted
Raslan Darawsheh [off-list ref];
mb@smartsharesystems.com;
quoted
quoted
quoted
quoted
ajit.khaparde@broadcom.com; Yigit, Ferruh
[off-list ref]; Singh, Jasvinder
[off-list ref]
Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS
profile



Hi Cristian

Thank you for review, please see inline.

From: Dumitrescu, Cristian
quoted
quoted
From: dev <redacted> On Behalf Of Li Zhang
<snip>
quoted
We had this same problem earlier for the rte_tm.h API, where
people
asked to
quoted
add support for WRED and shaper rates specified in packets
to the existing
byte
quoted
rate support. I am more than happy to support adding the
same here, but please let's adopt the same solution here
rather than invent a different approach.

Please refer to struct rte_tm_wred_params and struct
rte_tm_shaper_params
quoted
from rte_tm.h: the packets vs. bytes mode is explicitly
specified through
the use
quoted
of a flag called packet_mode that is added to the WRED and
shaper
profile.
quoted
quoted
quoted
When packet_mode is 0, the profile rates and bucket sizes
are specified in bytes per second and bytes, respectively;
when packet_mode is not 0, the profile rates and bucket
sizes are specified in packets and packets per
second,
quoted
respectively. The same profile parameters are used, no need
to invent additional algorithms (such as srTCM - packet
mode) or profile data
structures.
quoted
Can we do the same here, please?
This flag approach is very intuitive suggestion and it has
advantages.
quoted
quoted
quoted
quoted
quoted
quoted
The main problem with the flag approach is that it breaks ABI and
API.
quoted
quoted
quoted
quoted
quoted
quoted
The profile structure size is changed due to a new field - ABI
breakage.
quoted
quoted
quoted
quoted
The user must initialize the flag with zero to get old
behavior - API
breakage.
quoted
quoted
The rte_mtr API is experimental, all the API functions are
correctly marked with __rte_experimental in rte_mtr.h file, so
we can safely change the API
and
quoted
the ABI breakage is not applicable here. Therefore, this problem
does not
exist,
quoted
correct?
Yes, but still meter is not new API and I know that a lot of user
uses it for a long time.
Forcing them to change while we have good solution that don't
force it, looks me problematic.
Not really, only 3 drivers are currently implementing this API.
The user is not the PMD, the PMDs are the providers.
I'm talking about all our customers, all the current DPDK based
applications like OVS and others (I familiar with at least 4 ConnectX
customer applications) which use the meter API and I'm sure there are
more
quoted
around the world.
quoted
quoted
Even to these drivers, the required changes are none or extremely
small:
quoted
quoted
as Ajit
quoted
was also noting, as the default value of 0 continues to represent
the
existing
quoted
byte mode, all you have to do is make sure the new flag is set to
zero in the profile params structure, which is already done
implicitly in most places as
this
quoted
structure is initialized to all-zeros.
Are you sure all the world initialize the struct to 0? and also in
this case, without new compilation, not all the struct will be
zeroes(the old size is smaller).
quoted
A simple search exercise for struct rte_mtr_meter_profile is all
that is
needed.
quoted
You also agreed the flag approach is very intuitive, hence better
and nicer,
with
quoted
no additional work needed for you, so why not do it?
Do you understand that any current application that use the meter API
must recompile the code of the application? Part of them also probably
need to set the flag to 0....
Do you understand also the potential issues for the applications which
are not aware to the change? Debug time, etc....
quoted
quoted
quoted
quoted
I don't see issues with Li suggestion, Do you think Li
suggestion has critical issues?
It is probably better to keep the rte_mtr and the rte_tm APIs
aligned, it simplifies the code maintenance and improves the
user experience, which always pays off in the long run. Both
APIs configure token buckets in either packet mode or byte mode,
and it is desirable to have them work in the
same
quoted
way. Also, I think we should avoid duplicating configuration
data structures
for
quoted
to support essentially the same algorithms (such as srTCM or
trTCM) if we
can.
quoted
Yes, but I don't think this motivation is critical.
I really disagree. As API maintainer, making every effort to keep
the APIs
clear
quoted
and consistent is a critical task for me.
New pps profile is also clear and simple.
quoted
We don't want to proliferate the API data structures and parameters
if there is a good way to avoid it. Especially
in
quoted
cases like this, when the drivers are just beginning to pick up this
(still
experimental) API,  we have the rare chance to make things right and
therefore
quoted
we should do it. Please also keep in mind that, as more feature are
added
to
quoted
the API, small corner cuts like this one that might not look like a
big deal
now,
quoted
eventually come back as unnecessary complexity in the drivers
themselves.
quoted
quoted
I don't see a complexity in the current suggestion.
quoted
So, please, let's try to keep the quality of the APIs high.
Also by this way is high.


Look, the flag approach is also good and makes the job.
The two approaches are clear, simple and in high quality.
I don't care which one from the 2 to take but I want to be sure we are
all understand the pros and cons.

If you understand my concern on flag approach and still insist to take
the flag approach we will align.
Yes, thanks for summarizing the pros and cons, I confirm that I do
understand
quoted
your concerns.

Yes, sorry to disappoint you, I still think the packet_mode based approach
is
quoted
better for the long run, as it keeps the APIs clean and consistent. We are
not
quoted
adding new algorithms here, we're just adding a new mode to an existing
algorithm, so IMO we should not duplicate configuration data structures
and
quoted
proliferate the number of algorithms artificially.
Actually, PPS meter is a new algorithm - you can see that current algorithms
RFCs don't talk about PPS.
Yes, I know, I implemented it in librte_meter, but still, it is the same algorithm with just a different measurement unit (packet instead of byte), that's why many people (and you included :) ) still refer to it as srTCM  - RFC 2697.
quoted
Yes, I do realize that in some limited cases the users will have to explicitly
set
quoted
the new packet_mode flag to zero or one, in case they need to enable the
packet mode, but I think this is an acceptable cost because: (A) This API is
clearly marked as experimental; (B) It is better to take a small incremental
hit
quoted
now to keep the APIs in good order rather than taking a bit hit in a few
years as
quoted
more features are added in the wrong way and the APIs become
unmanageable.
I don't think that the current suggestion is in wrong way.
In any case, you insist, we will align.
Thank you.
quoted
quoted
And if we so, and we are going to break the API\ABI, we are going to
introduce new meter policy API soon and there, breaking API can help,
lets see in other discussion later.
Yes, as you point out API changes are unavoidable as new features are
added,
quoted
we have to manage the API evolution correctly.
quoted
One more point:
Currently, the meter_id is managed by the user, I think it is better
to let the PMDs to manage the meter_id.

Searching the PMD meter handler inside the PMD is very expensive for
the API call rate when the meter_id is managed by the user.

Same for profile_id.

Also all the rte_flow API including the shared action API taking the
PMD management approach.

What do you think?
Yes, we have carefully considered and discussed both approaches a few
years
quoted
back when the API was introduced, this is not done by accident :), there are
pros and cons for each of them.

If the object IDs are generated by the driver (outputs of the API), then it is
the
quoted
user application that needs to keep track of them, which can be very
painful.
quoted
Basically, for each API object the user application needs to create its own
wrapper to store this ID. We basically transfer this problem to the user app.
No exactly\not for all, the app gets the meter_id in the same time it decides
it now.
quoted
If the object IDs are generated by the user application (inputs into the API),
then we simplify the application by removing and indirection layer. Yes, it is
true that this indirection layer now moves into the driver, but we should try
to
quoted
make the life easier for the appl developers as opposed to us, the driver
developers. This indirection layer in the driver can be made a bit smarter
than
quoted
just a slow "for" loop; the search operation can be made faster with a small
bit
quoted
of effort, such as keeping this list sorted based on the object ID, splitting
this list
quoted
into buckets (similar to a hash table), etc, right?
Yes, there are even better solution than hash table from "rate" perspective.
I'd be very interested to hear your proposals here.
But any solution costs a lot of memory just for this mapping...
When we talked about 4M meters supported(in mlx5 next release) it
becomes an issue.
I thought your concern was about the speed/rate of API calls, you are saying it is not speed but memory footprint??

I would imagine that a system that enables all the 4M meters is a big beast with the most powerful CPU on the planet and many dozens of gigabytes of RAM, so a few extra megabytes for some API layers is not a concern?
quoted
Having the user app provide the object ID is especially important in the case
of
quoted
rte_tm API, where we have to deal with a tree of nodes, with thousands of
nodes for each level. Having the app to store and manages this tree of IDs
is a
quoted
really bad idea, as the user app needs to mirror the tree of nodes on its
side for
quoted
no real benefit. As an added benefit, the user can generate these IDs using
a
quoted
rule, such as: given the specific path through the tree, the value of the ID
can
quoted
be computed.
rte_tm is not rte_mtr - I think meter is different and used differently.
For example, as I know, no one from our dpdk meter customers(at least 5)
use TREEs for meter management. OVS, for example, just randomize some
meter_id and don't care about it.
What kinds of trees? I'd be very interested to hear some proposals to make this handle mapping faster.
Also, all the rte_flow API basics works with PMD ID\handle management
approach.
Yes, I am not saying it is wrong, none of the approaches is wrong IMO.
quoted
But again, as you also mention above, there is a list of pros and cons for
every
quoted
approach, no approach is perfect. We took this approach for the good
reasons
quoted
listed above.
If you familiar with TREE usage with meter, maybe we can combined easily
the two approaches in this topic,

meter_id argument can be by reference, if it 0 - PMD set it, if not PMD use it.
It would be good if you could elaborate here a bit, just to make sure we are on the same page.
quoted
quoted
quoted
quoted
quoted
The flag proposal is actually reducing the amount of work that
you guys
need to
quoted
do to implement your proposal. There is no negative impact to
your
proposal
quoted
and no big change, right?
Yes you right, but the implementation effect is not our concern.

quoted
quoted
quoted
This is a quick summary of the required API changes to add
support for the packet mode, they are minimal:
a) Introduce the packet_mode flag in the profile parameters
data
structure.
quoted
b) Change the description (comment) of the rate and bucket
size
parameters in
quoted
the meter profile parameters data structures to reflect that
their values represents either bytes or packets, depending
on the value of the new flag packet_mode from the same
structure.
quoted
quoted
quoted
quoted
quoted
quoted
quoted
c) Add the relevant capabilities: just search for "packet"
in the rte_tm.h capabilities data structures and apply the
same to the rte_mtr.h
capabilities,
quoted
when applicable.
quoted
Regards,
Cristian
Regards,
Cristian
Regards,
Cristian
Regards,
Cristian
Regards,
Cristian
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help