Thread: 94 messages, 15 authors, 2015-12-09

Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: John Fastabend <john.fastabend@gmail.com>
Date: 2015-12-08 07:34:08

On 15-12-02 04:15 PM, Tom Herbert wrote:
On Wed, Dec 2, 2015 at 3:35 PM, John Fastabend [off-list ref] wrote:
[...]
I wonder why we need protocol generic offloads? I know there are
currently a lot of overlay encapsulation protocols. Are there many more
coming?
Yes, and assume that there are more coming with an unbounded limit
(for instance I just noticed today that there is a netdev1.1 talk on
supporting GTP in the kernel). Besides, this problem space is not just
limited to offload of encapsulation protocols, but extends to how to
generalize offload of any transport, IPv[46], application protocols,
protocols implemented in user space, security protocols, etc.
quoted
Besides, this offload is about TSO and RSS and they do need to parse the
packet to get the information where the inner header starts. It is not
only about checksum offloading.
RSS does not require the device to parse the inner header. All the UDP
encapsulation protocols being defined set the source port to a flow
entropy value, and most devices already support RSS+UDP (it just needs
to be enabled), so this works just fine with dumb NICs. In fact, this is
one of the main motivations for encapsulating in UDP in the first place:
to leverage existing RSS and ECMP mechanisms. The more general solution
is to use the IPv6 flow label (RFC6438). We need HW support to include
the flow label in the hash for ECMP and RSS, but once we have that, much
of the motivation for using UDP goes away and we can get back to just
doing GRE/IP, IPIP, MPLS/IP, etc. (hence eliminating the overhead and
complexity of UDP encap).
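
As an illustration of the source-port entropy point, here is a minimal
userspace sketch of how an encapsulator might derive the outer UDP
source port from the inner flow, so that a NIC hashing only the outer
headers still spreads inner flows across queues. Everything named here
(struct inner_flow, mix32, flow_hash, entropy_src_port) is hypothetical,
the mixer is an arbitrary stand-in rather than the hash any kernel or
NIC actually uses, and the port range is whatever the caller chooses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inner-flow key: the fields an encapsulator would hash. */
struct inner_flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Arbitrary 32-bit mixer used only for this sketch. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Fold the inner flow fields into one 32-bit value. */
static uint32_t flow_hash(const struct inner_flow *f)
{
    uint32_t h = mix32(f->saddr);

    h = mix32(h ^ f->daddr);
    h = mix32(h ^ (((uint32_t)f->sport << 16) | f->dport));
    h = mix32(h ^ f->proto);
    return h;
}

/* Scale the hash into the inclusive port range [min, max]. */
static uint16_t entropy_src_port(uint32_t hash, uint16_t min, uint16_t max)
{
    uint32_t range;

    assert(min <= max);
    range = (uint32_t)max - min + 1;
    return (uint16_t)(min + (uint32_t)(((uint64_t)hash * range) >> 32));
}
```

Packets of one inner flow always get the same outer source port, so
they stay on one RSS queue or ECMP path, while different inner flows
tend to land on different ports without the device ever looking past
the outer UDP header.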
quoted
Please provide a sketch of a protocol-generic API that can tell
hardware where an inner protocol header starts, that supports vxlan,
vxlan-gpe, geneve and ipv6 extension headers, and that knows which
protocol is starting at that point.
BPF. Implementing protocol-generic offloads is not just a HW concern
either; adding kernel GRO code for every possible protocol that comes
along doesn't scale well. This becomes especially obvious when we
consider how to provide offloads for application protocols. If the
kernel provides a programmable framework for the offloads then
application protocols, such as QUIC, could use that without
needing to hack the kernel to support the specific protocol (which no
one wants!). Application protocol parsing in KCM and some other use
cases of BPF have already foreshadowed this, and we are working on a
prototype for a BPF programmable engine in the kernel. Presumably,
this same model could eventually be applied as the HW API to
programmable offload.
Just keying off the last statement there...

I think BPF programs are going to be hard to translate into hardware
for most devices. The problem is that BPF programs in general lack
structure. A parse graph would be much more friendly for hardware, or
at minimum the BPF program would need to be some sort of
well-structured program so a driver could turn that into a parse graph.
This might be relevant:
http://richard.systems/research/pdf/IEEE_HPSR_BPF_OPENFLOW.pdf
Thanks Tom, interesting read, but they seem to argue for a BPF engine in
hardware, which I'm still not convinced is necessary, and the numbers
provided are for a 1Gbps link where 10Gbps/100Gbps+ would be more
valuable.

I am still leaning towards a fully programmable parse graph and a set
of basic actions (push/pop/set/fwd/etc.). This would be useful for other
features, not just checksum offloads. I guess it doesn't necessarily
exclude also having 1s complement logic though.
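
To make the parse-graph idea a bit more concrete, here is a rough
sketch of what such a description might look like as plain data plus a
tiny walker. It is only an illustration under simplifying assumptions:
every header has a fixed length (so no IPv4 options or IPv6 extension
headers), each node selects its successor from a single 1- or 2-byte
big-endian field, and all the names (pg_node, pg_edge, pg_walk, graph)
are made up here and do not correspond to any existing kernel or driver
API. The example graph only covers eth -> ipv4 -> udp -> vxlan, using
the IANA-assigned VXLAN port 4789.

```c
#include <stddef.h>
#include <stdint.h>

struct pg_edge {
    uint16_t key;   /* value of the selector field */
    int next;       /* index of the node to parse next */
};

struct pg_node {
    const char *name;
    size_t hdr_len;               /* fixed header length in bytes */
    size_t key_off;               /* selector offset inside this header */
    size_t key_len;               /* 0 = leaf, else 1 or 2 (big-endian) */
    const struct pg_edge *edges;
    size_t n_edges;
};

enum { N_ETH, N_IPV4, N_UDP, N_VXLAN };

static const struct pg_edge eth_edges[]  = { { 0x0800, N_IPV4 } };
static const struct pg_edge ipv4_edges[] = { { 17, N_UDP } };
static const struct pg_edge udp_edges[]  = { { 4789, N_VXLAN } };

static const struct pg_node graph[] = {
    [N_ETH]   = { "eth",   14, 12, 2, eth_edges,  1 },
    [N_IPV4]  = { "ipv4",  20,  9, 1, ipv4_edges, 1 },
    [N_UDP]   = { "udp",    8,  2, 2, udp_edges,  1 },
    [N_VXLAN] = { "vxlan",  8,  0, 0, NULL,       0 },
};

/*
 * Walk the graph over pkt. Returns the offset of the data following the
 * leaf header (here, the inner Ethernet frame), or -1 if the packet is
 * truncated or does not follow any path in the graph.
 */
static long pg_walk(const struct pg_node *g, int start,
                    const uint8_t *pkt, size_t len)
{
    size_t off = 0;
    int n = start;

    for (;;) {
        const struct pg_node *node = &g[n];
        uint16_t key;
        int next = -1;
        size_t i;

        if (len - off < node->hdr_len)
            return -1;                      /* truncated header */
        if (node->key_len == 0)
            return (long)(off + node->hdr_len);

        key = pkt[off + node->key_off];
        if (node->key_len == 2)
            key = (uint16_t)((key << 8) | pkt[off + node->key_off + 1]);

        for (i = 0; i < node->n_edges; i++) {
            if (node->edges[i].key == key) {
                next = node->edges[i].next;
                break;
            }
        }
        if (next < 0)
            return -1;                      /* no matching edge */

        off += node->hdr_len;
        n = next;
    }
}
```

The point of keeping it as a table is that a driver could in principle
read the nodes and edges directly and program them into a device,
instead of having to recover that structure from an arbitrary program.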

.John