Re: Initial thoughts on TXDP

From: Tom Herbert <hidden>
Date: 2016-12-01 20:39:06

On Thu, Dec 1, 2016 at 12:13 PM, Sowmini Varadhan
[off-list ref] wrote:

On (12/01/16 11:05), Tom Herbert wrote:

quoted

Polling does not necessarily imply that networking monopolizes the CPU
except when the CPU is otherwise idle. Presumably the application
drives the polling when it is ready to receive work.

I'm not grokking that- "if the cpu is idle, we want to busy-poll
and make it 0% idle"?  Keeping CPU 0% idle has all sorts
of issues, see slide 20 of
 http://www.slideshare.net/shemminger/dpdk-performance

quoted

and one other critical difference from the hot-potato-forwarding
model (the sort of OVS model that DPDK etc might arguably be a fit for)
does not apply: in order to figure out the ethernet and IP headers
in the response correctly at all times (in the face of things like VRRP,
gw changes, gw's mac addr changes etc) the application should really
be listening on NETLINK sockets for modifications to the networking
state - again points to needing a select() socket set where you can
have both the I/O fds and the netlink socket,

I would think that is management, which would not be implemented in a
fast path processing thread for an application.

sure, but my point was that *XDP and other stack-bypass methods need
to provide a select()able socket: when your use-case is not about just
networking, you have to snoop on changes to the control plane, and update
your data path. In the OVS case (pure networking) the OVS control plane
updates are intrinsic to OVS. For the rest of the request/response world,
we need a select()able socket set to do this elegantly (not really
possible in DPDK, for example)
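
As an illustration of the select()able socket set described above, here is a
minimal sketch, assuming Linux and Python's standard socket and select
modules. The function names open_route_monitor() and poll_once() are
hypothetical and not part of XDP, TXDP or DPDK; the RTMGRP_* values are the
multicast group bits from <linux/rtnetlink.h>. It only shows one select()
call covering both a data socket and a control-plane socket; parsing the
netlink messages is left out.

```python
import select
import socket

# Multicast group bits from <linux/rtnetlink.h>.
RTMGRP_LINK = 0x1
RTMGRP_NEIGH = 0x4
RTMGRP_IPV4_ROUTE = 0x40


def open_route_monitor():
    """Open an rtnetlink socket subscribed to link, neighbour and
    IPv4 route change notifications (Linux only)."""
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    # pid 0 lets the kernel pick the netlink port id.
    s.bind((0, RTMGRP_LINK | RTMGRP_NEIGH | RTMGRP_IPV4_ROUTE))
    return s


def poll_once(data_sock, ctrl_sock, timeout=1.0):
    """Wait on both sockets and return a list of (kind, payload) events."""
    readable, _, _ = select.select([data_sock, ctrl_sock], [], [], timeout)
    events = []
    # Handle control-plane changes first so that the data path
    # builds its response headers from fresh state.
    if ctrl_sock in readable:
        events.append(("control", ctrl_sock.recv(65535)))
    if data_sock in readable:
        events.append(("data", data_sock.recv(65535)))
    return events
```

In a real application ctrl_sock would come from open_route_monitor() and
data_sock would be whatever fd carries the request/response traffic; any
pair of readable sockets works for trying out the loop.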

I'm not sure that TXDP can be reconciled to help OVS. The point of
TXDP is to drive applications closer to bare metal performance; as I
mentioned, this is only going to be worth it if the fast path can be
kept simple and not complicated by a requirement for generalization.
It seems like the second we put OVS in we're doubling the data path
and accepting the performance consequences of a complex path anyway.

TXDP can't take over the whole system (any more than DPDK can) and needs to
work in concert with other mechanisms-- the key is how to steer the
work amongst the CPUs. For instance, if a latency critical thread is
running on some CPU we either need a dedicated queue for the connections of
the thread (e.g. ntuple filtering or aRFS support) or we need a fast
way to move unrelated packets received on a queue processed by
that CPU to other CPUs (less efficient, but no special HW support is
needed either).
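
The two steering options above can be modelled in a few lines. This is only
a toy sketch in Python, not kernel or driver code: the function steer(), the
"pinned" table and the CRC32 hash are all assumptions made for illustration.
Flows listed in the table stand in for connections matched by an ntuple/aRFS
style filter and go to the CPU running the latency critical thread; every
other flow is hashed onto the remaining CPUs.

```python
import zlib


def steer(flow, pinned, n_cpus, reserved_cpu):
    """Pick a CPU index for a flow tuple.

    flow         -- hashable tuple, e.g. (src_ip, src_port, dst_ip, dst_port)
    pinned       -- dict mapping the latency critical thread's flows to its CPU
    n_cpus       -- total number of CPUs
    reserved_cpu -- CPU running the latency critical thread
    """
    # Dedicated-queue case: the flow belongs to the pinned thread.
    if flow in pinned:
        return pinned[flow]
    # Otherwise move the work away from the reserved CPU.
    others = [c for c in range(n_cpus) if c != reserved_cpu]
    if not others:
        return reserved_cpu  # single-CPU system, nowhere else to go
    # Deterministic hash so a given flow always lands on the same CPU.
    key = repr(flow).encode()
    return others[zlib.crc32(key) % len(others)]
```

For example, with four CPUs and CPU 0 reserved, a pinned flow maps to CPU 0
and any unrelated flow maps to one of CPUs 1 to 3.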

Tom

quoted

The *SOs are always an interesting question. They make for great
benchmarks, but in real life the amount of benefit is somewhat
unclear. Under the wrong conditions, like all cwnds have collapsed or

I think Rick's already bringing up this one.

--Sowmini
