Thread: 10 messages, 6 authors, 2015-02-02

Re: [PATCH net-next v2 0/6] net: Add STT support.

From: Andy Gospodarek <hidden>
Date: 2015-01-30 18:44:51

On Thu, Jan 29, 2015 at 09:03:14PM -0800, Pravin Shelar wrote:
On Thu, Jan 29, 2015 at 8:17 PM, Tom Herbert [off-list ref] wrote:
On Thu, Jan 29, 2015 at 8:04 PM, Pravin Shelar [off-list ref] wrote:
On Thu, Jan 29, 2015 at 7:46 PM, Alexander Duyck
[off-list ref] wrote:
On 01/29/2015 03:29 PM, Pravin B Shelar wrote:
The following patch series adds support for the Stateless Transport
Tunneling (STT) protocol.
STT uses the TCP segmentation offload available in most NICs. On
packet xmit, the STT driver adds an STT header along with a TCP header
to the packet. For a GSO packet, the GSO parameters are set according
to the tunnel configuration and the packet is handed over to the
networking stack. This allows use of the segmentation offload available
in NICs.

The protocol is documented at
http://www.ietf.org/archive/id/draft-davie-stt-06.txt
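
For readers unfamiliar with the draft, here is a rough sketch of the STT frame header that sits behind the TCP-like header. The field layout is recalled from draft-davie-stt and should be checked against the draft itself; the names `STT_HEADER_FORMAT`, `pack_stt_header` and `unpack_context_id` are hypothetical and do not come from the patch series.

```python
import struct

# Assumed layout (verify against draft-davie-stt): version, flags,
# L4 offset, reserved, max segment size, PCP/V/VLAN ID,
# 64-bit context ID, then 2 bytes of padding. Network byte order.
STT_HEADER_FORMAT = "!BBBBHHQH"
STT_HEADER_LEN = struct.calcsize(STT_HEADER_FORMAT)  # 18 bytes under this layout


def pack_stt_header(context_id, mss=0, l4_offset=0, flags=0, vlan_tci=0):
    """Build an illustrative STT frame header (hypothetical helper)."""
    version = 0   # assumed current version value
    reserved = 0
    padding = 0
    return struct.pack(STT_HEADER_FORMAT, version, flags, l4_offset,
                       reserved, mss, vlan_tci, context_id, padding)


def unpack_context_id(header):
    """Return the 64-bit context ID from a packed header."""
    fields = struct.unpack(STT_HEADER_FORMAT, header[:STT_HEADER_LEN])
    return fields[6]
```

Under this sketch, a tunnel created with `key 1` would presumably carry `context_id=1`; that mapping is an assumption, not something stated in the cover letter.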

I will send out OVS userspace patch on ovs-dev mailing list.

Following are test results. All tests are done on net-next with
STT and VXLAN kernel device without OVS.

Single Netperf session:
=======================
VXLAN:
    CPU utilization
     - Send local: 1.26
     - Recv remote: 8.62
    Throughput: 4.9 Gbit/sec
STT:
    CPU utilization
     - Send local: 1.01
     - Recv remote: 1.8
    Throughput: 9.45 Gbit/sec

Five Netperf sessions:
======================
VXLAN:
    CPU utilization
     - Send local: 9.7
     - Recv remote: 70 (varies from 60 to 80)
    Throughput: 9.05 Gbit/sec
STT:
    CPU utilization
     - Send local: 5.85
     - Recv remote: 14
    Throughput: 9.47 Gbit/sec
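
One way to compare the runs above is throughput per unit of CPU utilization. The sketch below only does that division on the posted figures. It assumes the CPU utilization values are percentages as reported by netperf, which the post does not state, and the helper names are hypothetical.

```python
# Figures copied from the results above:
# (throughput in Gbit/s, send-side CPU, receive-side CPU).
RESULTS = {
    ("single", "VXLAN"): (4.9, 1.26, 8.62),
    ("single", "STT"): (9.45, 1.01, 1.8),
    ("five", "VXLAN"): (9.05, 9.7, 70.0),   # recv reported as varying 60 to 80
    ("five", "STT"): (9.47, 5.85, 14.0),
}


def gbit_per_cpu(throughput, cpu):
    """Throughput divided by CPU utilization (units as posted)."""
    return throughput / cpu


def efficiency_table(results):
    """Map each (scenario, tunnel) to (send efficiency, recv efficiency)."""
    return {
        key: (gbit_per_cpu(tput, send), gbit_per_cpu(tput, recv))
        for key, (tput, send, recv) in results.items()
    }


if __name__ == "__main__":
    for (scenario, tunnel), (tx, rx) in efficiency_table(RESULTS).items():
        print(f"{scenario:6s} {tunnel:5s} send={tx:.2f} recv={rx:.2f}")
```

This says nothing about small-packet or non-TCP traffic; it only restates the TCP bulk-transfer numbers in a per-CPU form.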
What does the small packet or non-TCP performance look like for STT vs
VXLAN?  My concern is that STT looks like it is a one trick pony since
all your numbers show is TCP TSO performance, and based on some of the
comments in your patches it seems like other protocols such as UDP are
going to suffer pretty badly due to things like the linearization overhead.
The current implementation targets TCP workloads, which is why I
posted numbers with TCP; once UDP is optimized we can discuss UDP
numbers. I am pretty sure the STT code can be optimized further,
especially for protocols other than TCP.
--
There are many TCP workloads that use small packets; it is critical to
test for these as well. E.g. "super_netperf 200 -H <addr> -l 120 -t
TCP_RR -- -r 1,1"
I have not tried it on an STT device; I will collect those numbers.
Please provide the *exact* commands that you are using to configure
stt for optimal performance.
To create an STT tunnel device:
`ip link add stt1 type stt key 1 remote 1.1.2.128`

No other configuration is needed.
Thanks for posting some performance numbers with your patch.  I also
don't want to 'pile on' with additional complaints, but I do have one
request.

Can you share any specs (including number of cores and NIC hardware
used) for the systems that gave you the above results?   If you do not
want to endorse a particular NIC that is fine --  I'm mostly curious how
many cores were used and if UDP and TCP RSS were both being used in this
configuration.

Thanks!