Thread: 260 messages, 21 authors, 2017-11-14

Re: [RFC] Generic flow director/filtering/classification API

From: Chandran, Sugesh <hidden>
Date: 2016-07-18 13:26:27

Hi Adrien,
Thank you for getting back on this.
Please find my comments below.

Regards
_Sugesh

-----Original Message-----
From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
Sent: Friday, July 15, 2016 4:04 PM
To: Chandran, Sugesh <redacted>
Cc: dev@dpdk.org; Thomas Monjalon <redacted>;
Zhang, Helin [off-list ref]; Wu, Jingjing
[off-list ref]; Rasesh Mody [off-list ref]; Ajit
Khaparde [off-list ref]; Rahul Lakkireddy
[off-list ref]; Lu, Wenzhuo [off-list ref];
Jan Medala [off-list ref]; John Daley [off-list ref]; Chen,
Jing D [off-list ref]; Ananyev, Konstantin
[off-list ref]; Matej Vido [off-list ref];
Alejandro Lucero [off-list ref]; Sony Chacko
[off-list ref]; Jerin Jacob
[off-list ref]; De Lara Guarch, Pablo
[off-list ref]; Olga Shern [off-list ref];
Chilikin, Andrey [off-list ref]
Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
API

On Fri, Jul 15, 2016 at 09:23:26AM +0000, Chandran, Sugesh wrote:
Thank you Adrien,
Please find below for some more comments/inputs

Let me know your thoughts on this.
Thanks, stripping again non relevant parts.

[...]
[Sugesh] Is it a limitation to use only 32 bit ID? Is it possible to have a
64 bit ID? So that application can use the control plane flow pointer
itself as an ID. Does it make sense?
I've specified a 32 bit ID for now because this is what FDIR supports and
also what existing devices can report today AFAIK (i40e and mlx5).

We could use 64 bit for future-proofness in a separate action like "ID64"
when at least one device supports it.

To PMD maintainers: please comment if you know devices that
support tagging matching packets with more than 32 bits of
user-provided data!
[Sugesh] I guess the flow director ID is 64 bit, the XL710 datasheet says
so. And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with
rss hash. This can be a software driver limitation that exposes
only 32 bit. Possibly because of cache alignment issues? Since the
hardware can support 64 bit, I feel it makes sense to support 64 bit as
well.
I agree we need 64 bit support, but then we also need a solution for
devices that support only 32 bit. Possible methods I can think of:

- A separate "ID64" action (or a "ID32" one, perhaps with a better name).

- A single ID action with an unlimited number of bytes to return with
  packets (would actually be a string). PMDs can then refuse to create flow
  rules requesting an unsupported number of bytes. Devices supporting fewer
  than 32 bits are also included this way without the need for yet another
  action.

Thoughts?
[Sugesh] I feel the single ID approach is much better. But I would say
a fixed size ID is easy to handle at upper layers. Say PMD returns
64bit ID in which MSBs are masked out, based on how many bits the
hardware can support.
PMD can refuse the unsupported number of bytes when requested. So the
size of ID is going to be a parameter to program the flow.
What do you think?
What you suggest if I am not mistaken is:

 struct rte_flow_action_id {
     uint64_t id;
     uint64_t mask; /* either a bit-mask or a prefix/suffix length? */
 };

I think in this case a mask is more versatile than a prefix/suffix length as the
value itself comes in an unknown endian (from PMD's POV). It also allows
specific bits to be taken into account, like when HW only supports 32 bit, with
some black magic the full original 64 bit value can be restored as long as the
application only cares about at most 32 bits anywhere.

However I do not think many applications "won't care" about specific bits in a
given value and having to provide a properly crafted mask will be a hassle,
they will just fill it with ones and hope for the best. As a result they won't
take advantage of this feature or will stick to 32 bits all the time, or whatever
happens to be the least common denominator.

My previous suggestion was:

 struct rte_flow_action_id {
     uint8_t size; /* number of bytes in id[] */
     uint8_t id[];
 };

It does not solve the issue if an application requests more bytes than
supported, however as a string, there is no endianness ambiguity and these
bytes are copied as-is to the related mbuf field as if done through memcpy()
possibly with some padding to fill the entire 64 bit field (copied bytes thus
starting from MSB for big-endian machines, LSB for little-endian ones). The
value itself remains opaque to the PMD.
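
As a rough illustration of the byte-string variant, here is a minimal sketch in C. It is not part of the proposed API: the names flow_action_id and flow_id_to_field, the hw_max_bytes parameter and the choice of -ENOTSUP are all assumptions made for this example.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout mirroring the flexible-array suggestion above. */
struct flow_action_id {
    uint8_t size; /* number of bytes in id[] */
    uint8_t id[]; /* opaque bytes, never interpreted by the PMD */
};

/*
 * Copy the opaque ID into a 64-bit field the way memcpy() would,
 * zero-padding the remainder. Copied bytes start at the lowest address,
 * i.e. from the MSB on big-endian machines and the LSB on little-endian
 * ones. hw_max_bytes is an assumed per-device limit.
 *
 * Returns 0 on success, -ENOTSUP if more bytes are requested than the
 * device (or the 64-bit field) can carry.
 */
static int
flow_id_to_field(const struct flow_action_id *act, uint8_t hw_max_bytes,
                 uint64_t *field)
{
    if (act->size > sizeof(*field) || act->size > hw_max_bytes)
        return -ENOTSUP;
    *field = 0;
    memcpy(field, act->id, act->size);
    return 0;
}
```

A caller would read the bytes back with memcpy() as well, so the value round-trips without either side caring about endianness.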

One issue is the flexible array approach makes static initialization more
complicated. Maybe it is not worth the trouble since according to Andrey,
even X710 reports at most 32 bits of user data.

So what should we do? Fixed 32 bits ID for now to keep things simple, then
another action for 64 bits later when necessary?
[Sugesh] I agree with you. We could keep things simple by having 32 bit ID now.
I mixed up the size of ID with flexible payload size. Sorry about that.
In the future, we could add an action for 64 bit if necessary.
[...]
[Sugesh] Another concern is the cost and time of installing
these rules in the hardware. Can we make these APIs time
bound (or at least an option to set the time limit to execute
these APIs), so that Application doesn’t have to wait so long
when installing and deleting flows with slow hardware/NIC.
What do you think? Most of the datapath flow installations are
dynamic and triggered only when there is an ingress traffic.
Delay in flow insertion/deletion has unpredictable
consequences.

This API is (currently) aimed at the control path only, and must
indeed be assumed to be slow. Creating millions of rules may take
quite long as it may involve syscalls and other time-consuming
synchronization things on the PMD side.

So currently there is no plan to have rules added from the data
path with time constraints. I think it would be implemented
through a different set of functions anyway.

I do not think adding time limits is practical, even specifying
in the API that creating a single flow rule must take less than
a maximum number of seconds in order to be effective is too much
of a constraint (applications that create all flows during init
may not care after all).

You should consider in any case that modifying flow rules will
always be slower than receiving packets, there is no way around
that. Applications have to live with it and provide a software
fallback for incoming packets while managing flow rules.

Moreover, think about what happens when you hit the maximum number
of flow rules and cannot create any more. Applications need to
implement some kind of fallback in their data path.

Offloading flows in HW is also only useful if they live much
longer than the time taken to create and delete them. Perhaps
applications may choose to do so after detecting long lived
flows such as TCP sessions.

You may have one separate control thread dedicated to manage
flows and keep your normal control thread unaffected by delays.
Several threads can even be dedicated, one per device.
[Sugesh] I agree that the flow insertion cannot be as fast as the
packet receiving rate. From application point of view the problem
will be when hardware flow insertion takes longer than software
flow insertion. At least application has to know the cost of
inserting/deleting a rule in hardware beforehand. Otherwise how
can the application choose the right flow candidate for hardware? My
point here is application is expecting a deterministic behavior from a
classifier while inserting and deleting rules.

Understood, however it will be difficult to estimate, particularly
if a PMD must rearrange flow rules to make room for a new one due to
priority levels collision or some other HW-related reason. I mean,
spent time cannot be assumed to be constant, even PMDs cannot know
in advance because it also depends on the performance of the host CPU.

Such applications may find it easier to measure elapsed time for the
rules they create, make statistics and extrapolate from this
information for future rules. I do not think the PMD can help much here.
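
The measure-and-extrapolate idea could look roughly like the sketch below. Everything in it is hypothetical: rule_timing, timed_rule_op and the rule_op_t callback only stand in for whatever call the application uses to create or delete a rule, and fake_create_rule exists purely so the example is complete. It assumes a POSIX system providing clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

/* Running statistics on rule operation latency, in nanoseconds. */
struct rule_timing {
    uint64_t count;
    uint64_t total_ns;
    uint64_t max_ns;
};

/* Hypothetical callback standing in for a rule create/delete call. */
typedef int (*rule_op_t)(void *arg);

static uint64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Account for one completed operation. */
static void
rule_timing_record(struct rule_timing *t, uint64_t elapsed_ns)
{
    t->count++;
    t->total_ns += elapsed_ns;
    if (elapsed_ns > t->max_ns)
        t->max_ns = elapsed_ns;
}

/* Run op(arg), record how long it took, and return its result. */
static int
timed_rule_op(struct rule_timing *t, rule_op_t op, void *arg)
{
    uint64_t start = now_ns();
    int ret = op(arg);

    rule_timing_record(t, now_ns() - start);
    return ret;
}

/* Mean latency so far, used as the estimate for the next rule. */
static uint64_t
rule_timing_estimate_ns(const struct rule_timing *t)
{
    return t->count ? t->total_ns / t->count : 0;
}

/* Placeholder operation: counts calls and always succeeds. */
static int
fake_create_rule(void *arg)
{
    int *calls = arg;

    (*calls)++;
    return 0;
}
```

An application could compare rule_timing_estimate_ns() (or max_ns, to be pessimistic) against the expected lifetime of a flow before deciding whether offloading it is worthwhile.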
[Sugesh] From an application point of view this can be an issue.
There is even a security concern when we program a short-lived flow.
Let's consider this case:

1) Control plane programs the hardware with a queue termination flow.
2) Software dataplane is programmed to treat the packets from the specific
   queue accordingly.
3) Remove the flow from the hardware. (Let's consider this a long wait
   process.)

There is even a chance that the hardware takes more time to report the
status than to remove the flow physically. Now the packets in the queue are
no longer considered as matched/flow hit. This is because the software
dataplane update is yet to happen.
We need a way to sync between software datapath and classifier
APIs even though they are both programmed from a different control
thread.
Are we saying these APIs are only meant for user defined static flows??
No, that is definitely not the intent. These are good points.

With the specified API, applications may have to adapt their logic and take
extra precautions in order to remain on the safe side at all times.

For your above example, the application cannot assume a rule is
added/deleted as long as the PMD has not completed the related operation,
which means keeping the SW rule/fallback in place in the meantime. Should
handle security concerns as long as after removing a rule, packets end up in a
default queue entirely processed by SW. Obviously this may worsen
response time.

The ID action can help with this. By knowing which rule a received packet is
associated with, processing can be temporarily offloaded by another thread
without much complexity.
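
A minimal sketch of that precaution, under stated assumptions: the rule_state values, the rule_entry table and can_skip_sw_classification() are invented for this example, and synchronization between the control thread and the data path (atomics or similar) is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Application-side view of a rule; names are hypothetical. */
enum rule_state {
    RULE_SW_ONLY,    /* not offloaded, software handles everything */
    RULE_HW_PENDING, /* create/delete requested, PMD has not completed */
    RULE_HW_ACTIVE,  /* PMD reported successful creation */
};

struct rule_entry {
    uint32_t id;           /* value the ID action would attach to packets */
    enum rule_state state;
};

/*
 * Decide whether a received packet may bypass software classification.
 * Only packets carrying the ID of a rule whose creation the PMD has
 * confirmed qualify; anything else (no ID, unknown ID, rule still being
 * added or removed) stays on the software fallback path.
 */
static bool
can_skip_sw_classification(const struct rule_entry *table, unsigned int n,
                           bool has_id, uint32_t pkt_id)
{
    unsigned int i;

    if (!has_id)
        return false;
    for (i = 0; i < n; i++)
        if (table[i].id == pkt_id)
            return table[i].state == RULE_HW_ACTIVE;
    return false;
}
```

Before asking the PMD to remove a rule, the control thread would move its entry back to RULE_HW_PENDING so the data path falls back to software first.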
[Sugesh] Setting an ID for every flow may not be viable, especially when the size of ID
is small (only 8 bits). I am not sure this is a valid case though.

How about a hardware flow flag in the packet descriptor that is set when the
packet hits any hardware rule? This way software does not have to worry about or be blocked by a
hardware rule. Even though there is an additional overhead of validating this flag,
software datapath can identify the hardware processed packets easily.
This way the packets traverse the software fallback path until the rule configuration is
complete. This flag avoids setting an ID action for every hardware flow being configured.
I think applications have to implement SW fallbacks all the time, as even
some sort of guarantee on the flow rule processing time may not be enough
to avoid misdirected packets and related security issues.
[Sugesh] Software fallback will always be there. However I am a little bit confused about
the way software is going to identify the packets that are already hardware processed. I feel we need some
notification in the packet itself when a hardware rule hits. ID/flag/any other options?
Let's wait for applications to start using this API and then consider an extra
set of asynchronous / real-time functions when the need arises. It should not
impact the way rules are specified.
[Sugesh] Sure. I think the rule definition may not be impacted by this.
[Sugesh] Another query is on the synchronization part. What if
same rules are handled from different threads? Is application
responsible for handling the concurrent hardware programming?
Like most (if not all) DPDK APIs, applications are responsible
for managing locking issues as described in 4.3 (Behavior). Since
this is a control path API and applications usually have a
single control thread, locking should not be necessary in most cases.

Regarding my above comment about using several control threads
to manage different devices, section 4.3 says:

 "There is no provision for reentrancy/multi-thread safety, although
 nothing should prevent different devices from being configured at the
 same time. PMDs may protect their control path functions accordingly."

I'd like to emphasize it is not "per port" but "per device",
since in a few cases a configurable resource is shared by several ports.
It may be difficult for applications to determine which ports
are shared by a given device but this falls outside the scope of this
API.

Do you think adding the guarantee that it is always safe to
configure two different ports simultaneously without locking
from the application side is necessary? In which case the PMD
would be responsible for locking shared resources.
[Sugesh] This would be a little bit complicated when some of the ports
are not under DPDK itself (what if one port is managed by the kernel?)
or ports are tied to different applications. Locking in PMD helps
when the ports are accessed by multiple DPDK applications. However
what if the port itself is not under DPDK?

Well, either we do not care about what happens outside of the DPDK
context, or PMDs must find a way to satisfy everyone. I'm not a fan
of locking either but it would be nice if flow rules configuration
could be attempted on different ports simultaneously without the
risk of wrecking anything, so that applications do not need to care.

Possible cases for a dual port device with global flow rule settings
affecting both ports:

1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
   to alter a global setting necessary for an existing rule on any port is
   not allowed (EEXIST). PMD must maintain a device context common to both
   ports in order for this to work. This context is either under lock, or
   the first port on which a flow rule is created owns all future flow
   rules.

2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
   it and knows that port 2 may modify the global context: no flow rules can
   be created from the DPDK application due to safety issues (EBUSY?).

3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
   it and knows that port 2 will not modify flow rules: PMD should not care,
   no lock necessary.

4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
   aware of it: either flow rules cannot be created ever at all, or we say
   it is user's responsibility to make sure this does not happen.

Considering that most control operations performed by DPDK affect
the device regardless of other applications, I think 1) is the only
case that should be defined, otherwise 4), defined as user's
responsibility.
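
For case 1), the shared device context could be sketched as follows. This is only an illustration: dev_ctx, dev_ctx_claim and dev_ctx_release are made-up names, the single 32-bit global_setting stands in for whatever device-wide configuration a real PMD tracks, and the lock mentioned in 1) is omitted.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical context shared by all ports of one device. */
struct dev_ctx {
    unsigned int rule_count; /* rules on any port relying on the setting */
    uint32_t global_setting; /* stand-in for a device-wide configuration */
};

/*
 * Called when a new rule on any port needs the global setting to have
 * the value `wanted`. Refuses with -EEXIST if existing rules depend on
 * a different value, otherwise applies the value and takes a reference.
 */
static int
dev_ctx_claim(struct dev_ctx *ctx, uint32_t wanted)
{
    if (ctx->rule_count != 0 && ctx->global_setting != wanted)
        return -EEXIST;
    ctx->global_setting = wanted;
    ctx->rule_count++;
    return 0;
}

/* Called when a rule is destroyed; drops its reference. */
static void
dev_ctx_release(struct dev_ctx *ctx)
{
    if (ctx->rule_count != 0)
        ctx->rule_count--;
}
```

Once every rule depending on the old value has been released, a rule needing a different global setting can be created again.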

No more comments on this part? What do you suggest?
[Sugesh] I agree with your suggestions. I feel this is the best that can offer.
--
Adrien Mazarguil
6WIND