Re: [RFC] Generic flow director/filtering/classification API
From: Lu, Wenzhuo <hidden>
Date: 2016-07-21 03:18:17
Hi Adrien,
-----Original Message-----
From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
Sent: Wednesday, July 20, 2016 6:41 PM
To: Lu, Wenzhuo
Cc: dev@dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody; Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev, Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara Guarch, Pablo; Olga Shern
Subject: Re: [RFC] Generic flow director/filtering/classification API

Hi Wenzhuo,

On Wed, Jul 20, 2016 at 02:16:51AM +0000, Lu, Wenzhuo wrote:
[...]
So, today an application cannot combine N-tuple and FDIR flow rules and get a reliable outcome, unless it is designed for specific devices with a known behavior.
What's the right behavior of the PMD if the APP wants to create a flow director rule which has a higher or even equal priority than an existing n-tuple rule? Should the PMD return failure?

First remember applications only deal with the generic API; PMDs are responsible for choosing the most appropriate HW implementation to use according to the requested flow rules (FDIR, N-tuple or anything else). For the specific case of FDIR vs N-tuple, if the underlying HW supports both I do not see why the PMD would create an N-tuple rule. Doesn't FDIR support everything N-tuple can do and much more?

Talking about the filters, fdir can cover n-tuple. I think that's why i40e only supports fdir but not n-tuple. But n-tuple has its own advantage. As we know, at least on Intel NICs, fdir only supports a per-device mask, while n-tuple can support a per-rule mask.
As every pattern has both a spec and a mask, we cannot guarantee the masks are the same. I think ixgbe will try to use n-tuple first if it can, because even if the masks are different, we can support them all.

OK, makes sense. In that case existing rules may indeed prevent subsequent ones from getting created if their priority is wrong. I do not think there is a way around that if the application needs this exact ordering.
Agree. I don’t see any workaround either. PMD has to return fail sometimes.
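The mask constraint discussed above can be sketched as follows. This is only an illustration under stated assumptions: every demo_* name is hypothetical, the single 32-bit mask is a simplification, and "n-tuple first, then FDIR" merely mirrors the ixgbe preference mentioned in this thread rather than any actual driver code.

```c
#include <stdint.h>

/* Hypothetical filter types a PMD could map a generic rule onto. */
enum demo_filter_type {
	DEMO_FILTER_NTUPLE,
	DEMO_FILTER_FDIR,
	DEMO_FILTER_NONE, /* No HW resource fits: creation has to fail. */
};

/* Hypothetical per-device state, heavily simplified. */
struct demo_dev {
	uint32_t fdir_mask;       /* Single mask shared by all FDIR rules. */
	int fdir_mask_set;        /* Nonzero once the first FDIR rule exists. */
	unsigned int ntuple_free; /* Remaining n-tuple slots (per-rule mask). */
};

/* Pick a filter type for a rule carrying its own mask. */
static enum demo_filter_type
demo_pick_filter(struct demo_dev *dev, uint32_t rule_mask)
{
	/* n-tuple accepts any mask, so try it first while slots remain. */
	if (dev->ntuple_free > 0) {
		dev->ntuple_free--;
		return DEMO_FILTER_NTUPLE;
	}
	/* The first FDIR rule decides the device-wide mask. */
	if (!dev->fdir_mask_set) {
		dev->fdir_mask = rule_mask;
		dev->fdir_mask_set = 1;
		return DEMO_FILTER_FDIR;
	}
	/* Later FDIR rules are only possible with the very same mask. */
	if (dev->fdir_mask == rule_mask)
		return DEMO_FILTER_FDIR;
	return DEMO_FILTER_NONE;
}
```

In this sketch DEMO_FILTER_NONE is the case where an existing rule prevents a later one from being created, which is where the PMD would have to report a failure to the application.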
Assuming such a thing happened anyway, that the PMD had to create a rule using a high priority filter type and that the application requests the creation of a rule that can only be done using a lower priority filter type, but also requested a higher priority for that rule, then yes, it should obviously fail.

That is, unless the PMD can perform some kind of workaround to have both.
If so, do we need more failure reasons? According to this RFC, I think we need to return "EEXIST: collision with an existing rule.", but it's not very clear. The APP doesn't know the problem is priority; maybe a more detailed reason would be helpful.

Possibly, I've defined a basic set of errors, there are quite a number of errno values to choose from. However I think we should not define too many values.
In my opinion the basic set covers every possible failure:

- EINVAL: invalid format, rule is broken or cannot be understood by the PMD anyhow.
- ENOTSUP: pattern/actions look fine but something in the requested rule is not supported and thus cannot be applied.
- EEXIST: pattern/actions are fine and could have been applied if only some other rule did not prevent the PMD from doing it (I see it as the closest thing to "ETOOBAD" which unfortunately does not exist).
- ENOMEM: like EEXIST, except it is due to the lack of resources, not because of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but settled on ENOMEM as it is well known. Still open to debate.

Errno values are only useful to get a rough idea of the reason, and another mechanism is needed to pinpoint the exact problem for debugging/reporting purposes, something like:

 enum rte_flow_error_type {
         RTE_FLOW_ERROR_TYPE_NONE,
         RTE_FLOW_ERROR_TYPE_UNKNOWN,
         RTE_FLOW_ERROR_TYPE_PRIORITY,
         RTE_FLOW_ERROR_TYPE_PATTERN,
         RTE_FLOW_ERROR_TYPE_ACTION,
 };

 struct rte_flow_error {
         enum rte_flow_error_type type;
         void *offset; /* Points to the exact pattern item or action. */
         const char *message;
 };

When we are using a CLI and it fails, normally it will let us know which parameter is not appropriate. So, I think it’s a good idea to have this error structure :)

Agreed.
Then either provide an optional struct rte_flow_error pointer to rte_flow_validate(), or a separate function (rte_flow_analyze()?), since processing this may be quite expensive and applications may not care about the exact reason.

Agree the processing may be too expensive. Maybe we can say it's optional to return error details. And that's a good question: what should the APP do if creating the rule fails? I believe normally it will choose to handle the rule by itself. But I think it's not bad to feed back more. Otherwise, even if the APP wants to adjust the rules, it cannot do so for lack of info.

All right then, I'll add it to the specification.

 int
 rte_flow_validate(uint8_t port_id,
                   const struct rte_flow_pattern *pattern,
                   const struct rte_flow_actions *actions,
                   struct rte_flow_error *error);

With error possibly NULL if the application does not care. Is it fine for you?
Yes, it looks good to me. Thanks for that :)
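A minimal sketch of how the optional error pointer could behave, using the error structure proposed earlier in this thread. Assumptions: demo_flow_validate() is a hypothetical stand-in with the same parameter list as the proposed rte_flow_validate(), the pattern/actions structures are left opaque, and returning a negative errno value is a convention assumed here, not something the RFC text above settles.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Error types and structure as proposed earlier in this thread. */
enum rte_flow_error_type {
	RTE_FLOW_ERROR_TYPE_NONE,
	RTE_FLOW_ERROR_TYPE_UNKNOWN,
	RTE_FLOW_ERROR_TYPE_PRIORITY,
	RTE_FLOW_ERROR_TYPE_PATTERN,
	RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
	enum rte_flow_error_type type;
	void *offset; /* Points to the exact pattern item or action. */
	const char *message;
};

/* Left opaque: their layout belongs to the RFC, not to this sketch. */
struct rte_flow_pattern;
struct rte_flow_actions;

/* Hypothetical stand-in; it only rejects missing arguments. */
static int
demo_flow_validate(uint8_t port_id,
		   const struct rte_flow_pattern *pattern,
		   const struct rte_flow_actions *actions,
		   struct rte_flow_error *error)
{
	(void)port_id; /* Unused in this sketch. */
	if (pattern == NULL || actions == NULL) {
		if (error != NULL) { /* Details only if the caller asked. */
			error->type = pattern == NULL ?
				RTE_FLOW_ERROR_TYPE_PATTERN :
				RTE_FLOW_ERROR_TYPE_ACTION;
			error->offset = NULL;
			error->message = "missing pattern or actions";
		}
		return -EINVAL; /* Assumed convention: negative errno. */
	}
	if (error != NULL) {
		error->type = RTE_FLOW_ERROR_TYPE_NONE;
		error->offset = NULL;
		error->message = NULL;
	}
	return 0;
}
```

An application that only needs the rough reason passes NULL and looks at the return value; one that wants to report the faulty item passes a struct rte_flow_error and reads type and message after a failure.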
[...]

- PMDs, not applications, are responsible for maintaining flow rules configuration when stopping and restarting a port or performing other actions which may affect them. They can only be destroyed explicitly.

Don’t understand "They can only be destroyed explicitly."

This part says that as long as an application has not called rte_flow_destroy() on a flow rule, it never disappears, whatever happens to the port (stopped, restarted). The application is not responsible for re-creating rules after that.

Note that according to the specification, this may translate to not being able to stop a port as long as a flow rule is present, depending on how nice the PMD intends to be with applications. Implementation can be done in small steps with minimal amount of code on the PMD side.
Does it mean the PMD should store and maintain all the rules? Why not let RTE do that? I think if the PMD maintains all the rules, it means every kind of NIC should have a copy of code for the rules. But if RTE does that, only one copy of code needs to be maintained, right?

I've considered having rules stored in a common format understood at the RTE level and not specific to each PMD and decided that the opaque rte_flow pointer was a better choice for the following reasons:

- Even though flow rules management is done in the control path, processing must be as fast as possible. Letting PMDs store flow rules using their own internal representation gives them the chance to achieve better performance.

I don’t quite understand. I think we're talking about maintaining the rules in SW. I don’t think there's something that needs to be optimized according to specific NICs. If we need to optimize the code, I think we need to consider the CPU, OS ... and some common means. Am I wrong?

Perhaps we were talking about different things, here I was only explaining why rte_flow (the result of creating a flow rule) should be opaque and fully managed by the PMD. More on the SW side of things below.
- An opaque context managed by PMDs would probably have to be stored somewhere as well anyway.
- PMDs may not need to allocate/store anything at all if they exclusively rely on HW state for everything. In my opinion, the generic API has enough constraints for this to work and maintain consistency between flow rules. Note this is currently how most PMDs implement FDIR and other filter types.

Yes, the rules are stored by HW. But considering stop/start of the device, the rules in HW will be lost. We have to store the rules in SW and re-program them when restarting the device.

Assume a HW capable of keeping flow rules programmed even during a stop/start cycle (e.g. mlx4/mlx5 may be able to do it from DPDK point of view), don't you think it is more efficient to standardize on this behavior and let PMDs restore flow rules for HW that do not support it regardless of whether it would be done by RTE or the application (SW)?
Didn’t know that. As some NICs already have the ability to keep the rules during a stop/start cycle, maybe it could be a trend :)
And in existing code, we store the filters in SW, at least on Intel NICs. But I think we cannot reuse them, because considering the priority and which category of filter should be chosen, I think we need a whole new table for the generic API. I think it’s what's designed now, right?

So I understand you'd want RTE to help your PMD keep track of the flow rules it created?
Yes. But as you said before, it’s not a good idea for mlx4/mlx5, because their HW doesn't need SW to re-program the rules after stopping/starting. If we make it a common mechanism, it just wastes time for mlx4/mlx5.
Nothing wrong with that, all I'm saying is that it should be entirely optional. RTE should not automatically maintain a list. PMDs have to call RTE helpers if they need help to maintain a context. These helpers are not defined in this API yet because it is difficult to know what will be useful in advance.

- RTE can (and will) provide helpers to avoid most of the code redundancy, PMDs are free to use them or manage everything by themselves.
- Given that the opaque rte_flow pointer associated with a flow rule is to be stored by the application, PMDs do not even have to keep references to them.

Don’t understand. More details?

In an application:

 struct rte_flow *foo = rte_flow_create(...);

In the above example, foo cannot be dereferenced by the application nor RTE, only the PMD is aware of its contents. This object can only be used with rte_flow*() functions.

PMDs are thus free to make this object grow as needed when adding internal features without breaking any kind of public API/ABI.

What I meant is, given that the application is supposed to store foo somewhere in order to destroy it later, the PMD does not have to keep track of that pointer assuming it does not need to access it later on its own for some reason.
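The opaque pointer idea can be illustrated with a plain C incomplete type. This is a sketch only: struct demo_flow and the demo_* functions are hypothetical stand-ins for struct rte_flow and the rte_flow*() calls, and the single hw_index field is an invented example of PMD-private state.

```c
#include <stdlib.h>

/* What the application sees (hypothetical public declarations). */
struct demo_flow; /* Incomplete type: callers cannot dereference it. */
struct demo_flow *demo_flow_create(unsigned int hw_index);
void demo_flow_destroy(struct demo_flow *flow);

/* What only the PMD sees (hypothetical private definition). */
struct demo_flow {
	unsigned int hw_index; /* Can grow later without any ABI change. */
};

struct demo_flow *
demo_flow_create(unsigned int hw_index)
{
	struct demo_flow *flow = malloc(sizeof(*flow));

	if (flow != NULL)
		flow->hw_index = hw_index;
	/* The caller stores the pointer; no reference is kept here. */
	return flow;
}

/* PMD-internal accessor, not part of what applications would call. */
static unsigned int
demo_pmd_hw_index(const struct demo_flow *flow)
{
	return flow->hw_index;
}

void
demo_flow_destroy(struct demo_flow *flow)
{
	free(flow); /* free(NULL) is a no-op, so NULL is accepted. */
}
```

In a real split the private definition would live in the PMD's own source file, so adding fields to it would not change what applications are compiled against.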
- The flow rules format described in this specification (pattern / actions) will be used by applications directly, and they will be free to arrange them in lists, trees or in any other way if they need to keep flow specifications around for further processing.

Who will create the lists, trees or something else? According to the previous discussion, I think the APP will program the rules one by one. So if the APP organizes the rules into lists, trees..., the PMD doesn’t know that.

And you said "Given that the opaque rte_flow pointer associated with a flow rule is to be stored by the application". I'm lost here.

I guess that's because we're discussing two different things, flow rule specifications and flow rule objects. Let me sum it up:

- Flow rule specifications are the patterns/actions combinations provided by applications to rte_flow_create(). Applications can store those as needed and organize them as they wish (hash, tree, list). Neither PMDs nor RTE will do it for them.
- Flow rule objects (struct rte_flow *) are generated when a flow rule is created. Applications must keep these around if they want to manipulate them later (i.e. destroy or query existing rules).
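The distinction summed up above can be sketched from the application side. Everything here is an assumption made for illustration: struct demo_spec is a drastically simplified stand-in for a pattern/actions pair, struct demo_flow stands in for the opaque struct rte_flow, and a fixed-size array is just one of many ways an application could organize its rules.

```c
#include <stddef.h>

struct demo_flow; /* Opaque object, standing in for struct rte_flow. */

/* Hypothetical, heavily simplified flow rule specification. */
struct demo_spec {
	unsigned int dst_port; /* Pattern: match this destination port. */
	unsigned int queue;    /* Action: steer matching packets here. */
};

/* Application-side entry pairing what was asked with what was obtained. */
struct demo_app_rule {
	struct demo_spec spec;    /* Specification, owned by the application. */
	struct demo_flow *handle; /* Object returned on creation, or NULL. */
	void *app_ctx;            /* Room for application-specific context. */
};

#define DEMO_MAX_RULES 8

struct demo_app_table {
	struct demo_app_rule rules[DEMO_MAX_RULES];
	size_t count;
};

/* Record a rule; returns 0 on success, -1 when the table is full. */
static int
demo_app_remember(struct demo_app_table *table, const struct demo_spec *spec,
		  struct demo_flow *handle, void *app_ctx)
{
	struct demo_app_rule *rule;

	if (table->count == DEMO_MAX_RULES)
		return -1;
	rule = &table->rules[table->count++];
	rule->spec = *spec;
	rule->handle = handle;
	rule->app_ctx = app_ctx;
	return 0;
}

/* Find the object created for a destination port; NULL if none. */
static struct demo_flow *
demo_app_find(const struct demo_app_table *table, unsigned int dst_port)
{
	size_t i;

	for (i = 0; i < table->count; i++)
		if (table->rules[i].spec.dst_port == dst_port)
			return table->rules[i].handle;
	return NULL;
}
```

The handle found this way is what the application would later pass to a destroy or query call; neither the PMD nor RTE needs to know how the table is laid out.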
Thanks for this clarification. So the specifications can differ from the objects, right? The specifications are what the APP wants, the objects are what the APP really gets, as rte_flow_create can fail. Right?
Then PMDs *may* need to keep and arrange flow rule objects internally for management purposes. Could be because HW requires it, detecting conflicting rules, managing priorities and so on. Possible reasons are not described in this API because these are thought of as PMD-specific needs.
Got it.
When the port is stopped and restarted, RTE can reconfigure the rules. Is the concern that the PMD may adjust the sequence of the rules according to the priority, so every NIC has a different list of rules? But the PMD can adjust them again when RTE reconfigures the rules.

What about PMDs able to stop and restart ports without destroying their own flow rules? If we assume flow rules must be destroyed when stopping a port, these PMDs are needlessly penalized with slower stop/start cycles. Think about it assuming thousands of flow rules.

I believe the rules maintained by SW should not be destroyed, because they're used for re-programming when the device starts again.

Do we agree that applications should not care? Flow rules configured before stopping a port must still be there after restarting it.
Yes, agree.
What we seem to not agree about is that you think RTE should be responsible for restoring flow rules of devices that lose them when stopped. I think doing so is unfair to devices for which it is not the case and not really nice to applications, so my opinion is that the PMD is responsible for restoring flow rules however it wants. It is free to use RTE helpers to keep track of them, as long as it's all managed internally.
What I think is RTE can store the flow rules and recreate them after restarting, in a function like rte_dev_start, so the APP knows nothing about it. But according to the discussion above, I think the design doesn't support it, right? RTE doesn't store the flow rule objects, and even if it stored them, there's no way designed to re-program the objects. Also, some HW doesn't need to be re-programmed. I think it's OK to let the PMD maintain the rules, as the re-programming is a NIC-specific requirement.
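A minimal sketch of the outcome agreed on here, where the PMD itself restores rules after a restart when its HW loses them. All names are hypothetical, the "HW" is simulated with a flag, and a fixed-size shadow table stands in for whatever a real PMD would keep.

```c
#include <stddef.h>

#define DEMO_MAX_FLOWS 4

/* Hypothetical PMD-private flow object (opaque to applications). */
struct demo_flow {
	unsigned int match; /* Simplified rule contents. */
	int in_use;         /* Slot holds a rule not yet destroyed. */
	int programmed;     /* Nonzero while present in (simulated) HW. */
};

struct demo_port {
	struct demo_flow flows[DEMO_MAX_FLOWS]; /* SW copy kept by the PMD. */
	int hw_keeps_rules; /* Nonzero if HW retains rules across stop/start. */
	int started;
};

/* Create a rule; program it right away if the port is running. */
static struct demo_flow *
demo_flow_create(struct demo_port *port, unsigned int match)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_FLOWS; i++) {
		struct demo_flow *flow = &port->flows[i];

		if (flow->in_use)
			continue;
		flow->match = match;
		flow->in_use = 1;
		flow->programmed = port->started;
		return flow;
	}
	return NULL; /* Out of slots. */
}

/* Stop the port; HW may or may not forget its rules. */
static void
demo_port_stop(struct demo_port *port)
{
	size_t i;

	port->started = 0;
	if (port->hw_keeps_rules)
		return;
	for (i = 0; i < DEMO_MAX_FLOWS; i++)
		port->flows[i].programmed = 0;
}

/* Start the port; returns how many rules had to be re-programmed. */
static int
demo_port_start(struct demo_port *port)
{
	size_t i;
	int replayed = 0;

	port->started = 1;
	for (i = 0; i < DEMO_MAX_FLOWS; i++) {
		struct demo_flow *flow = &port->flows[i];

		if (flow->in_use && !flow->programmed) {
			flow->programmed = 1; /* Replay from the SW copy. */
			replayed++;
		}
	}
	return replayed;
}
```

The pointer returned by demo_flow_create() stays valid across the stop/start cycle in both cases, and a port whose HW keeps its rules replays nothing on start, which is the cost difference raised earlier in this thread.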
Thus from an application point of view, whatever happens when stopping and restarting a port should not matter. If a flow rule was present before, it must still be present afterwards. If the PMD had to destroy flow rules and re-create them, it does not actually matter if they differ slightly at the HW level, as long as:

- Existing opaque flow rule pointers (rte_flow) are still valid to the PMD and refer to the same rules.
- The overall behavior of all rules is the same.

The list of rules you think of (patterns / actions) is maintained by applications (not RTE), and only if they need them. RTE would needlessly duplicate this.

As said before, I need more details to understand this. Maybe an example is better :)

The generic format both RTE and applications might understand is the one described in this API (struct rte_flow_pattern and struct rte_flow_actions).

If we wanted RTE to maintain some sort of per-port state for flow rule specifications, it would have to be a copy of these structures arranged somehow (list or something else).

If we consider that PMDs need to keep a context object associated to a flow rule (the opaque struct rte_flow *), then RTE would most likely have to store it along with the flow specification.

Such a list may not be useful to applications (list lookups take time), so they would implement their own redundant method. They might also require extra room to attach some application context to flow rules. A generic list cannot plan for it.

Applications know what they want to do with flow rules and are responsible for managing them efficiently with RTE out of the way.

I'm not sure if this answered your question, if not, please describe a scenario where a RTE-managed list of flow rules would be mandatory.
Got your point and agree :)
-- Adrien Mazarguil 6WIND