Thread: 260 messages, 21 authors, 2017-11-14

Re: [RFC] Generic flow director/filtering/classification API

From: Adrien Mazarguil <hidden>
Date: 2016-08-04 13:25:00

On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
[...]
The proposal looks very good.  It satisfies most of the features
supported by Chelsio NICs.  We are looking for suggestions on exposing
more additional features supported by Chelsio NICs via this API.

Chelsio NICs have two regions in which filters can be placed -
Maskfull and Maskless regions.  As their names imply, the maskfull region
can accept masks to match a range of values, whereas the maskless region
doesn't accept any masks and hence performs stricter exact matches.
Filters without masks can also be placed in the maskfull region.  By
default, the maskless region has higher priority than the maskfull region.
However, the priority between the two regions is configurable.
I understand this configuration affects the entire device. Just to be clear,
assuming some filters are already configured, are they affected by a change
of region priority later?
Both the regions exist at the same time in the device.  Each filter can
either belong to maskfull or the maskless region.

The priority is configured at the time of filter creation for every
individual filter and cannot be changed while the filter is still
active. If the priority needs to be changed for a particular filter, then
the filter needs to be deleted first and re-created.
Could you model this as two tables and add a table_id to the API? This
way user space could populate the table it chooses. We would have to add
some capabilities attributes to "learn" if tables support masks or not
though.
This approach sounds interesting.
Now I understand the idea behind these tables, however from an application
point of view I still think it's better if the PMD could take care of flow
rules optimizations automatically. Think about it, PMDs have exactly a
single kind of device they know perfectly well to manage, while applications
want the best possible performance out of any device in the most generic
fashion.
The problem is keeping priorities in order and/or possibly breaking
rules apart (e.g. you have an L2 table and an L3 table) becomes very
complex to manage at driver level. I think it's easier for the
application, which has some context, to do this. An application that
"knows" it's a router, for example, will likely be able to pack rules
better than a PMD will.
I don't think most applications know they are L2 or L3 routers. They may not
know more than the pattern provided to the PMD, which may indeed end at a L2
or L3 protocol. If the application simply chooses a table based on this
information, then the PMD could have easily done the same.
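
To illustrate the point above, here is a minimal sketch of how a PMD could
pick a table from nothing more than the pattern it is given. Every name in
it (enum layer, enum hw_table, table_for_pattern) is hypothetical and only
assumes a device with one L2 and one L3 table; it is not taken from any
existing driver or from the proposed API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol layers a pattern can contain, END-terminated. */
enum layer { LAYER_END, LAYER_L2, LAYER_L3, LAYER_L4 };

/* Hypothetical HW tables of an imaginary two-table device. */
enum hw_table { TABLE_NONE, TABLE_L2, TABLE_L3 };

/*
 * Choose a table from the deepest layer the pattern reaches.
 * Returns TABLE_NONE when this imaginary device has no suitable table
 * (empty pattern, or a pattern going deeper than L3).
 */
static enum hw_table
table_for_pattern(const enum layer *pattern)
{
	enum layer deepest = LAYER_END;

	assert(pattern != NULL);
	for (; *pattern != LAYER_END; pattern++)
		if (*pattern > deepest)
			deepest = *pattern;

	switch (deepest) {
	case LAYER_L2:
		return TABLE_L2;
	case LAYER_L3:
		return TABLE_L3;
	default:
		return TABLE_NONE;
	}
}
```

An application choosing a table from the same information would end up
with an equivalent lookup, only duplicated for every device it supports.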

I understand the issue is what happens when applications really want to
define e.g. L2/L3/L2 rules in this specific order (or any ordering that
cannot be satisfied by HW due to table constraints).

By exposing tables, in such a case applications should move all rules from
L2 to a L3 table themselves (assuming this is even supported) to guarantee
ordering between rules, or fail to add them. This is basically what the PMD
could have done, possibly in a more efficient manner in my opinion.

Let's assume two opposite scenarios for this discussion:

- App #1 is a command-line interface directly mapped to flow rules, which
  basically gets slow random input from users depending on how they want to
  configure their traffic. All rules differ considerably (L2, L3, L4, some
  with incomplete bit-masks, etc). All in all, few but complex rules with
  specific priorities.

- App #2 is something like OVS, creating and deleting a large number of very
  specific (without incomplete bit-masks) and mostly identical
  single-priority rules automatically and very frequently.

Actual applications will certainly be a mix of both.

For app #1, users would have to be aware of these tables and base their
filtering decisions on them. Reporting table capabilities and making
sure priorities between tables are well configured will be their
responsibility. Obviously applications may take care of these details for
them, but the end result will be the same. At some point, some combination
won't be possible. Getting there was only more complicated from the
user's/application's point of view.

For app #2 if the first rule can be created then subsequent rules shouldn't
be a problem until their number reaches device limits. Selecting the proper
table to use for these can easily be done by the PMD.
I don't see how the PMD can sort this out in any meaningful way and it
has to be exposed to the application that has the intelligence to 'know'
priorities between masks and non-masks filters. I'm sure you could come
up with something but it would be less than ideal in many cases I would
guess and we can't have the driver getting priorities wrong or we may
not get the correct behavior.
It may be solved by having the PMD maintain a SW state to quickly know which
rules are currently created and in what state the device is so basically the
application doesn't have to perform this work.

This API allows applications to express basic needs such as "redirect
packets matching this pattern to that queue". It must not deal with HW
details and limitations in my opinion. If a request cannot be satisfied,
then the rule cannot be created. No help from the application must be
expected by PMDs, otherwise it opens the door to the same issues as the
legacy filtering APIs.
This depends on the application and what/how it wants to manage the
device. If the application manages a pipeline with some set of tables,
then mapping this down to a single table, which then the PMD has to
unwind back to a multi-table topology to me seems like a waste.
Of course, only I am not sure applications will behave differently if they
are aware of HW tables. I fear it will make things more complicated for
them and they will just stick with the most capable table all the time, but
I agree it should be easier for PMDs.
[...]
Unfortunately, our maskfull region is extremely small too compared to
maskless region.
To me this means a userspace application would want to pack it
carefully to get the full benefit. So you need some mechanism to specify
the "region" hence the above table proposal.
Right. Makes sense.
I do not agree, applications should not be aware of it. Note this case can
be handled differently, so that rules do not have to be moved back and forth
between both tables. If the first created rule requires a maskfull entry,
then all subsequent rules will be entered into that table. Otherwise no
maskfull entry can be created as long as there is one maskless entry. When
either table is full, no more rules may be added. Would that work for you?
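
A minimal sketch of the policy described above, assuming the PMD keeps a
small software state per port. The names (enum region, struct region_state,
place_rule, remove_rule) and the capacities are hypothetical and do not
describe any existing driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region identifiers. */
enum region { REGION_NONE, REGION_MASKFULL, REGION_MASKLESS };

/* Hypothetical per-port software state kept by the PMD. */
struct region_state {
	enum region active;  /* region selected by the first rule */
	size_t used;         /* entries currently in the active region */
	size_t maskfull_cap; /* assumed capacity of the maskfull region */
	size_t maskless_cap; /* assumed capacity of the maskless region */
};

/*
 * Decide where a new rule goes. The first rule selects the region; all
 * later rules follow it. Returns REGION_NONE when the rule is refused:
 * either it needs a mask while maskless entries exist, or the active
 * region is full.
 */
static enum region
place_rule(struct region_state *st, bool needs_mask)
{
	enum region target;
	size_t cap;

	assert(st != NULL);
	target = st->active;
	if (target == REGION_NONE)
		target = needs_mask ? REGION_MASKFULL : REGION_MASKLESS;
	if (needs_mask && target == REGION_MASKLESS)
		return REGION_NONE;
	cap = (target == REGION_MASKFULL) ? st->maskfull_cap : st->maskless_cap;
	if (st->used >= cap)
		return REGION_NONE;
	st->active = target;
	st->used++;
	return target;
}

/* Release one entry; once the region is empty the next rule may reselect. */
static void
remove_rule(struct region_state *st)
{
	assert(st != NULL && st->used > 0);
	if (--st->used == 0)
		st->active = REGION_NONE;
}
```

Rules never move between regions in this sketch; the cost is that a
maskless-first workload cannot add a masked rule until all of its entries
are removed.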
It's not about mask vs no mask. The devices with multiple tables that I
have don't have these mask limitations. It's about how to optimally pack
the rules and who implements that logic. I think it's best done in the
application where I have the context.

Is there a way to omit the table field if the PMD is expected to do
a best effort and add the table field if the user wants explicit
control over table management? This would support both models. I at least
would like to have explicit control over rule population in my pipeline
for use cases where I'm building a pipeline on top of the hardware.
Yes, that's a possibility. Perhaps the table ID to use could be specified as
a meta pattern item? We'd still need methods to report how many tables exist
and perhaps some way to report their limitations; these could be added later
through a separate set of functions.
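
As a rough sketch of the meta pattern item idea, assuming an END-terminated
item list: the table item is optional, and its absence means the PMD picks
a table on a best-effort basis. All types and names below (enum item_type,
struct item, struct item_table, pattern_table_id) are hypothetical and are
not part of the proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; ITEM_TABLE is the optional meta item. */
enum item_type { ITEM_END, ITEM_TABLE, ITEM_ETH, ITEM_IPV4 };

/* Hypothetical pattern item: a type plus a type-specific spec. */
struct item {
	enum item_type type;
	const void *spec;
};

/* Hypothetical spec attached to an ITEM_TABLE item. */
struct item_table {
	uint32_t id;
};

/*
 * Look for an explicit table request in the pattern.
 * Returns true and stores the ID when the application asked for a table;
 * returns false and leaves *id untouched when the choice is left to the PMD.
 */
static bool
pattern_table_id(const struct item *pattern, uint32_t *id)
{
	assert(pattern != NULL && id != NULL);
	for (; pattern->type != ITEM_END; pattern++) {
		if (pattern->type == ITEM_TABLE && pattern->spec != NULL) {
			*id = ((const struct item_table *)pattern->spec)->id;
			return true;
		}
	}
	return false;
}
```

Existing applications that never emit the table item would keep working
unchanged, while pipeline-style applications could opt in per rule.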

[...]
For this adding a meta-data item seems simplest to me. And if you want
to make the default to be only a single port that would maybe make it
easier for existing apps to port from flow director. Then if an
application cares it can create a list of ports if needed.
Agreed.
However, although I'm not opposed to adding dedicated meta items, remember
applications will not automatically benefit from the increased performance.
If only a single PMD implements this feature, their maintainers will probably
not bother with it.
Unless, as we noted in the other thread, the application is closely bound to
its hardware for capability reasons. In this case it would make sense
to implement.
Sure.

[...]

-- 
Adrien Mazarguil
6WIND