Thread: 33 messages, 8 authors, 2011-06-21

Re: Linux TCP's Robustness to Multipath Packet Reordering

From: Eric Dumazet <hidden>
Date: 2011-04-25 11:25:07

On Monday, 25 April 2011 at 12:37 +0200, Dominik Kaspar wrote:
Hello,

Knowing how critical packet reordering is for standard TCP, I am
currently testing how robust Linux TCP is when packets are forwarded
over multiple paths (with different bandwidth and RTT). Since Linux
TCP adapts its "dupAck threshold" to an estimated level of packet
reordering, I expect it to be much more robust than a standard TCP
that strictly follows the RFCs. Indeed, as you can see in the
following plot, my experiments show a step-wise adaptation of Linux
TCP to heavy reordering. After many minutes, Linux TCP finally reaches
a data throughput close to the perfect aggregated data rate of two
paths (emulated with characteristics similar to IEEE 802.11b (WLAN)
and a 3G link (HSPA)):

http://home.simula.no/~kaspar/static/mptcp-emu-wlan-hspa-00.png

Does anyone have a clue what's going on here? Why does the aggregated
throughput increase in steps? And why does it take minutes to adapt
to the full capacity when, in other cases, Linux TCP adapts much
faster (for example, if the bandwidths of both paths are equal)?
I would highly appreciate some advice from the netdev community.

Implementation details:
This multipath TCP experiment ran between a sending machine with a
single Ethernet interface (eth0) and a client with two Ethernet
interfaces (eth1, eth2). The machines are connected through a switch
and tc/netem is used to emulate the bandwidth and RTT of both paths.
TCP connections are established using iperf between eth0 and eth1 (the
primary path). At the sender, an iptables' NFQUEUE is used to "spoof"
the destination IP address of outgoing packets and force some to
travel to eth2 instead of eth1 (the secondary path). This multipath
scheduling happens in proportion to the emulated bandwidths, so if the
paths are set to 500 and 1000 KB/s, then packets are distributed in a
1:2 ratio. At the client, iptables' RAWDNAT is used to translate the
spoofed IP addresses back to the original address, so that all packets
end up at eth1, although a portion actually travelled to eth2. ACKs are
not scheduled over multiple paths, but always travel back on the
primary path. TCP does not notice anything of the multipath
forwarding, except the side-effect of packet reordering, which can be
huge if the path RTTs are set very differently.
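
The bandwidth-proportional scheduling described above can be sketched as follows. This is only an illustration of one possible weighted scheduler, not the code used in the experiment: the function name `schedule_packets` and the "largest deficit first" rule are assumptions, and the real setup makes this decision inside an NFQUEUE handler.

```python
def schedule_packets(num_packets, rates):
    """Assign each packet to a path so that per-path packet counts
    stay proportional to the configured path rates.

    rates: per-path bandwidths in any common unit, e.g. [500, 1000] KB/s.
    Returns a list of path indices, one per packet, in sending order.
    """
    total = sum(rates)
    sent = [0] * len(rates)
    order = []
    for n in range(1, num_packets + 1):
        # Deficit of path i after n packets: its fair share (rates[i] * n / total)
        # minus what it has already carried. Both terms are scaled by `total`
        # so the comparison uses integers only.
        path = max(range(len(rates)),
                   key=lambda i: rates[i] * n - sent[i] * total)
        sent[path] += 1
        order.append(path)
    return order


if __name__ == "__main__":
    # Paths at 500 and 1000 KB/s: path 1 carries twice as many packets.
    print(schedule_packets(6, [500, 1000]))
```

With rates of 500 and 1000 KB/s, six packets are split 2:4 between the two paths, matching the 1:2 ratio mentioned above.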

Hi Dominik,

Implementation details of the tc/netem stages are important to fully
understand how the TCP stack can react.

For example, is TSO active on the sender side?
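
One way to answer that question is to read the offload settings reported by `ethtool -k`. The sketch below is a convenience wrapper, not part of the original exchange: the helper names `parse_offloads` and `tso_enabled` are made up for illustration, and the interface name `eth0` is taken from the setup description. The parser accepts feature names written either with hyphens or with spaces, on the assumption that some older ethtool releases print the latter.

```python
import subprocess


def parse_offloads(ethtool_output):
    """Parse `ethtool -k` output into {feature-name: bool}."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not value.strip():
            continue  # header line such as "Features for eth0:"
        state = value.split()[0]  # ignore trailing notes like "[fixed]"
        if state in ("on", "off"):
            # Normalise "tcp segmentation offload" to the hyphenated form.
            features[name.strip().replace(" ", "-")] = (state == "on")
    return features


def tso_enabled(interface="eth0"):
    """Return True/False for TSO, or None if the feature is not listed."""
    out = subprocess.run(["ethtool", "-k", interface],
                         capture_output=True, text=True, check=True).stdout
    return parse_offloads(out).get("tcp-segmentation-offload")
```

Calling `tso_enabled("eth0")` on the sender requires ethtool to be installed there; `parse_offloads` can be tested on saved output without it.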

Your results show that only some exceptional events really change the
bandwidth.

A tcpdump/pcap of the first ~10,000 packets would be nice to have
(please put it on your web site rather than posting it to the mailing
list).
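
A capture like this could be produced along the following lines. It is a sketch under assumptions: the helper names, the output file name `trace.pcap`, and the 96-byte snap length (enough for the Ethernet, IP and TCP headers) are illustrative choices, not something requested in the thread. Only standard tcpdump options are used (`-i`, `-c`, `-s`, `-w`).

```python
import subprocess


def capture_command(interface, count=10000, outfile="trace.pcap", snaplen=96):
    """Build a tcpdump command that stops after `count` packets."""
    return ["tcpdump",
            "-i", interface,     # interface to listen on
            "-c", str(count),    # exit after this many packets
            "-s", str(snaplen),  # keep headers only to limit file size
            "-w", outfile]       # write raw packets to a pcap file


def capture(interface, **kwargs):
    """Run the capture; usually needs root privileges."""
    subprocess.run(capture_command(interface, **kwargs), check=True)
```

Starting `capture("eth0")` on the sender before launching iperf would record the beginning of the connection, which is where the step-wise behaviour starts.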

