Re: [RFC] Generic flow director/filtering/classification API
From: John Fastabend <john.fastabend@gmail.com>
Date: 2016-08-10 16:46:53
On 16-08-10 06:37 AM, Adrien Mazarguil wrote:
On Tue, Aug 09, 2016 at 02:47:44PM -0700, John Fastabend wrote:
On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote: [...]
The problem is that keeping priorities in order and/or possibly breaking rules apart (e.g. you have an L2 table and an L3 table) becomes very complex to manage at driver level. I think it's easier for the application, which has some context, to do this. The application "knows" if it's a router, for example, and will likely be able to pack rules better than a PMD will.

I don't think most applications know they are L2 or L3 routers. They may not know more than the pattern provided to the PMD, which may indeed end at an L2 or L3 protocol. If the application simply chooses a table based on this information, then the PMD could have easily done the same.

But when we start thinking about encap/decap then it's natural to start using this interface to implement various forwarding dataplanes. And one common way to organize a switch is into a TEP, router, switch (mac/vlan), ACL tables, etc. In fact we see this topology starting to show up in the NICs now. Further, each table may be "managed" by a different entity, in which case the software will want to manage the physical and virtual networks separately. It doesn't make sense to me to require a software aggregator object to marshal the rules into a flat table and then for a PMD to split them apart again.

OK, my point was mostly about handling basic cases easily and making sure applications do not have to bother with petty HW details when they do not want to, yet still get maximum performance by having the PMD make the most appropriate choices automatically. You've convinced me that in many cases PMDs won't be able to optimize efficiently and that conscious applications will know better. The API has to provide the ability to do so. I think it's fine as long as it is not mandatory.
Great. I also agree that making the table feature _not_ mandatory for many use cases will be helpful. I'm just making sure we get all the use cases I know of covered.
I understand the issue is what happens when applications really want to define e.g. L2/L3/L2 rules in this specific order (or any ordering that cannot be satisfied by HW due to table constraints). By exposing tables, in such a case applications should move all rules from L2 to an L3 table themselves (assuming this is even supported) to guarantee ordering between rules, or fail to add them. This is basically what the PMD could have done, possibly in a more efficient manner in my opinion.

I disagree with the more efficient comment :) If the software layer is working on L2/TEP/ACL/router layers, merging them just to pull them back apart is not going to be more efficient.

Moving flow rules around cannot be efficient by definition; however, I think that attempting to describe table capabilities may be as complicated as describing HW bit-masking features. Applications may get it wrong as a result while a PMD would not make any mistake. Your use case is valid though: if the application already groups rules, then sharing this information with the PMD would make sense from a performance standpoint.
Let's assume two opposite scenarios for this discussion:

- App #1 is a command-line interface directly mapped to flow rules, which basically gets slow random input from users depending on how they want to configure their traffic. All rules differ considerably (L2, L3, L4, some with incomplete bit-masks, etc). All in all, few but complex rules with specific priorities.

Agree with this and in this case the application should be behind any network physical/virtual and not giving rules like encap/decap/etc. This application either sits on the physical function and "owns" the hardware resource or sits behind a virtual switch.
- App #2 is something like OVS, creating and deleting a large number of very specific (without incomplete bit-masks) and mostly identical single-priority rules automatically and very frequently.

Maybe for OVS, but not all virtual switches are built with flat tables at the bottom like this. Nor is it necessarily optimal. Another application (the one I'm concerned about :) would be built as a pipeline, something like

    ACL -> TEP -> ACL -> VEB -> ACL

If I have hardware that supports a TEP hardware block, an ACL hardware block and a VEB block, for example, I don't want to merge my control plane into a single table. The merging in this case is just pure overhead/complexity for no gain.

It could be done by dedicating priority ranges for each item in the pipeline but then it would be clunky. OK then, let's discuss the best approach to implement this.

[...]
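To make the "priority ranges" alternative concrete, here is a minimal sketch of how a flat priority could be derived from a pipeline stage. Everything in it is an assumption for illustration: the stage names, the `PRIO_PER_STAGE` range size, the `flat_priority()` helper, and the convention that a lower value means a higher priority. None of it comes from an existing DPDK API. Note that it only encodes ordering between stages; it does not express that a packet is meant to traverse every stage, which is part of why the approach is clunky.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pipeline stages, in lookup order (illustrative names). */
enum stage {
    STAGE_ACL_IN,
    STAGE_TEP,
    STAGE_ACL_MID,
    STAGE_VEB,
    STAGE_ACL_OUT,
    STAGE_COUNT
};

/* Assumed size of the priority range reserved for each stage. */
#define PRIO_PER_STAGE 1024u

/*
 * Map (stage, per-stage priority) to one flat priority value.
 * Assumption: a lower value means a higher priority.
 */
uint32_t flat_priority(enum stage s, uint32_t local_prio)
{
    assert(s < STAGE_COUNT);
    assert(local_prio < PRIO_PER_STAGE);
    return (uint32_t)s * PRIO_PER_STAGE + local_prio;
}
```

With this layout every stage is capped at `PRIO_PER_STAGE` priority levels, and the application has to keep the range bookkeeping in sync with the hardware layout by hand.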
It's not about mask vs no mask. The devices with multiple tables that I have don't have these mask limitations. It's about how to optimally pack the rules and who implements that logic. I think it's best done in the application where I have the context. Is there a way to omit the table field if the PMD is expected to do a best effort, and add the table field if the user wants explicit control over table mgmt? This would support both models. I at least would like to have explicit control over rule population in my pipeline for use cases where I'm building a pipeline on top of the hardware.

Yes that's a possibility. Perhaps the table ID to use could be specified as a meta pattern item? We'd still need methods to report how many tables exist and perhaps some way to report their limitations; these could come later through a separate set of functions.

Sure, I think a meta pattern item would be fine, or put it in the API call directly, something like

    rte_flow_create(port_id, pattern, actions);
    rte_flow_create_table(port_id, table_id, pattern, actions);

I suggest using a common method for both cases, either seems fine to me, as long as a default table value can be provided (zero) when applications do not care.
Works for me. Just use zero as the default when the application has no preference and expects the PMD to do the table mapping.
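A minimal sketch of the "single method, zero means no preference" idea follows. The `struct flow_attr` layout, the `FLOW_TABLE_DEFAULT` constant and the `flow_resolve_table()` helper are hypothetical names invented for this illustration and are not part of any API proposed in this thread; `pmd_choice` simply stands in for whatever best-effort placement a PMD would apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: table ID 0 means "no preference, let the PMD choose". */
#define FLOW_TABLE_DEFAULT 0u

/* Hypothetical attributes passed to a single flow-creation call. */
struct flow_attr {
    uint32_t table_id;  /* FLOW_TABLE_DEFAULT unless the application cares */
    uint32_t priority;  /* priority within the table */
};

/*
 * Decide which table a rule lands in. An explicit table ID is honored;
 * the default value defers to the PMD's own choice (pmd_choice).
 */
uint32_t flow_resolve_table(const struct flow_attr *attr, uint32_t pmd_choice)
{
    assert(attr != NULL);
    if (attr->table_id == FLOW_TABLE_DEFAULT)
        return pmd_choice;
    return attr->table_id;
}
```

A zero-initialized `struct flow_attr` selects the best-effort behaviour, so applications that do not care about tables never need to touch the field.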
Now about table management, I think there is no need to expose table capabilities (in case they have different capabilities) but instead provide guidelines as part of the specification to encourage application writers to group similar rules in tables. As previously discussed, flow rule priorities would be specific to the table they are assigned to.
This seems sufficient to me.
Like flow rules, table priorities could be handled through their index with index 0 having the highest priority. Like flow rule priorities, table indices wouldn't have to be contiguous. If this works for you, how about renaming "tables" to "groups"?
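The ordering described here (table/group index first, then the rule priority inside it, with 0 the highest in both cases and gaps allowed) can be sketched as a plain comparison function. The `struct rule_key` type and the function names are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key identifying where a rule sits in the lookup order. */
struct rule_key {
    uint32_t group;     /* 0 = highest-priority group; need not be contiguous */
    uint32_t priority;  /* 0 = highest priority within its group */
};

/* qsort() comparator: order by group first, then by priority in the group. */
int rule_key_cmp(const void *a, const void *b)
{
    const struct rule_key *x = a;
    const struct rule_key *y = b;

    if (x->group != y->group)
        return x->group < y->group ? -1 : 1;
    if (x->priority != y->priority)
        return x->priority < y->priority ? -1 : 1;
    return 0;
}

/* Sort rules into evaluation order, highest precedence first. */
void sort_rules(struct rule_key *rules, size_t n)
{
    qsort(rules, n, sizeof(*rules), rule_key_cmp);
}
```

Because a priority is only compared when the groups are equal, a rule with priority 0 in group 7 still sorts after every rule in group 0, which matches the statement that rule priorities are specific to their table.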
Works for me. And actually I like renaming them "groups" as this seems more neutral to how the hardware actually implements a group. For example I've worked on hardware with multiple Tunnel Endpoint engines but we exposed it as a single "group" to simplify the user interface.

.John