Re: TCP funny-ness when over-driving a 1Gbps link.
From: Rick Jones <hidden>
Date: 2011-05-20 00:46:21
On Thu, 2011-05-19 at 17:37 -0700, Ben Greear wrote:
> On 05/19/2011 05:24 PM, Rick Jones wrote:
> > > > > [root@i7-965-1 igb]# netstat -an|grep tcp|grep 8.1.1
> > > > > tcp  0         0  8.1.1.1:33038  0.0.0.0:*      LISTEN
> > > > > tcp  0         0  8.1.1.1:33040  0.0.0.0:*      LISTEN
> > > > > tcp  0         0  8.1.1.1:33042  0.0.0.0:*      LISTEN
> > > > > tcp  0   9328612  8.1.1.2:33039  8.1.1.1:33040  ESTABLISHED
> > > > > tcp  0  17083176  8.1.1.1:33038  8.1.1.2:33037  ESTABLISHED
> > > > > tcp  0   9437340  8.1.1.2:33037  8.1.1.1:33038  ESTABLISHED
> > > > > tcp  0  17024620  8.1.1.1:33040  8.1.1.2:33039  ESTABLISHED
> > > > > tcp  0  19557040  8.1.1.1:33042  8.1.1.2:33041  ESTABLISHED
> > > > > tcp  0   9416600  8.1.1.2:33041  8.1.1.1:33042  ESTABLISHED
> > > >
> > > > I take it your system has higher values for the tcp_wmem value:
> > > >
> > > > net.ipv4.tcp_wmem = 4096 16384 4194304
> > >
> > > Yes:
> > > [root@i7-965-1 igb]# cat /proc/sys/net/ipv4/tcp_wmem
> > > 4096 16384 50000000
> >
> > Why?!?  Are you trying to get link-rate to Mars or something?  (I
> > assume tcp_rmem is similarly set...)  If you are indeed doing one
> > 1 GbE, and no more than 100ms then the default (?) of 4194304 should
> > have been more than sufficient.
>
> Well, we occasionally do tests over emulated links that have several
> seconds of delay and may be running multiple Gbps.  Either way, I'd
> hope that offering extra RAM to a subsystem wouldn't cause it to go
> nuts.
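
For reference, the three tcp_wmem fields are the minimum, default and maximum per-socket send buffer sizes in bytes. A minimal sketch of parsing the two settings quoted above follows; the helper names parse_tcp_mem and read_tcp_wmem are made up for illustration, and reading the /proc path assumes a Linux host.

```python
from pathlib import Path


def parse_tcp_mem(text):
    """Parse a 'min default max' triple as printed by tcp_wmem/tcp_rmem."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {text!r}")
    lo, default, hi = (int(f) for f in fields)
    return {"min": lo, "default": default, "max": hi}


def read_tcp_wmem(path="/proc/sys/net/ipv4/tcp_wmem"):
    """Read the live setting (Linux only; hypothetical helper)."""
    return parse_tcp_mem(Path(path).read_text())


# The two settings quoted in this thread.
tuned = parse_tcp_mem("4096 16384 50000000")
stock = parse_tcp_mem("4096 16384 4194304")

# How much larger the tuned ceiling is than the other one.
ratio = tuned["max"] / stock["max"]
print(f"max send buffer: {tuned['max']} vs {stock['max']} bytes "
      f"({ratio:.1f}x)")
```

Only the third field differs between the two settings, so the tuned box lets autotuning grow a socket's send buffer to roughly twelve times the other ceiling.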

It has been my experience that the autotuning tends to grow things
beyond the bandwidth x delay product.

As for several seconds of delay and multiple Gbps - unless you are
shooting the Moon, that sounds like bufferbloat ?-)
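
To put rough numbers on the bandwidth x delay product, here is a back-of-the-envelope sketch. It ignores the per-packet bookkeeping overhead the kernel also charges against the socket buffer, so treat the results as lower bounds on the buffer needed; the function names are made up for illustration.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bytes in flight needed to keep a path full: rate * RTT / 8."""
    return rate_bps * rtt_ms // 8000


def rtt_covered_ms(buffer_bytes, rate_bps):
    """Largest RTT (ms) a given buffer can keep full at a given rate."""
    return buffer_bytes * 8000 / rate_bps


GBPS = 1_000_000_000

# 1 GbE with 100 ms of round-trip delay.
print(bdp_bytes(GBPS, 100))              # 12500000

# RTT that each tcp_wmem maximum from this thread covers at 1 Gbps.
print(rtt_covered_ms(4_194_304, GBPS))   # about 33.55 ms
print(rtt_covered_ms(50_000_000, GBPS))  # 400.0 ms
```

The Send-Q figures of 17 to 19 MB in the netstat output quoted above are already larger than the 12.5 MB a 1 Gbps path with 100 ms of delay would need.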

> Assuming this isn't some magical 1Gbps issue, you could probably hit
> the same problem with a wifi link and default tcp_wmem settings...

Do you also increase the tx queues for the NIC(s)?

rick