Tuesday, May 8, 2012

Increase initcwnd for TCP Performance

MTU & MSS
MTU (maximum transmission unit): Nearly all IP over Ethernet implementations use the Ethernet V2 frame format, whose maximum payload (the MTU) is 1500 bytes. On Linux, the ifconfig output shows the MTU of each interface.

MSS (maximum segment size) is a TCP parameter that specifies the largest amount of data, in octets, that a computer or communications device can receive in a single TCP segment, and therefore in a single IP datagram. It does not count the TCP header or the IP header.

Therefore, TCP/IP Headers + MSS ≤ MTU
MTU = 1500
TCP Header = 20
IP Header = 20
TCP Option = 12 (optional, e.g. timestamps)
MSS = 1460 (1448 when the 12-byte TCP option is present)
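
The arithmetic above can be checked with a small shell sketch. The function name mss_for_mtu is made up for this illustration, and it assumes plain 20-byte IPv4 and TCP headers with no IP options.

```shell
# mss_for_mtu MTU [TCP_OPTION_BYTES]
# Hypothetical helper: subtracts a 20-byte IPv4 header, a 20-byte TCP header
# and any extra TCP option bytes from the MTU.
mss_for_mtu() {
    mtu=$1
    tcp_options=${2:-0}   # e.g. 12 when TCP timestamps are in use
    echo $(( mtu - 20 - 20 - tcp_options ))
}

mss_for_mtu 1500      # prints 1460
mss_for_mtu 1500 12   # prints 1448
```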

Two TCP Windows
Congestion Window (cwnd) controls the number of packets a TCP flow may have in the network at any given time. cwnd adapts dynamically to changing network conditions. Slow start is one of the algorithms TCP uses to control congestion inside the network; it is also known as the exponential growth phase. When cwnd reaches a certain threshold (known as ssthresh), TCP enters the linear growth phase, congestion avoidance. Linux 2.6.39 increased the initial congestion window to 10 packets; previous versions used 3.

The Receiver Advertised Window (rwnd) is the available buffer size that the TCP receiver reports to the TCP sender in each ACK. The window size is 65535 bytes (64K) on Windows/Mac/iOS.

The purpose of the sliding window is to prevent the sender from sending so many packets that it overflows network resources or the receiver's buffer. The "sliding window size" is the maximum amount of data we can send without having to wait for an ACK; it is the smaller of cwnd and rwnd.

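Therefore, with the suggested initcwnd of 10, the sliding window allows 10 * MSS bytes of data in flight before the first ACK arrives, provided the receiver's window is at least that large. This does not eliminate TCP slow start, but it shortens it, which improves TCP performance, especially for short transfers.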
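
To get a rough feel for the gain, here is a sketch that counts the round trips needed to send a given number of bytes. It is an idealized model, not a measurement: it assumes cwnd doubles every round trip, there is no packet loss, and rwnd never limits the sender. The function name rounds_needed is made up for this illustration.

```shell
# rounds_needed BYTES INITCWND [MSS]
# Hypothetical helper: counts round trips in an idealized slow start
# (cwnd doubles each round, no loss, receiver window never the limit).
rounds_needed() {
    bytes=$1
    cwnd=$2
    mss=${3:-1460}
    segments=$(( (bytes + mss - 1) / mss ))   # round up to whole segments
    rounds=0
    while [ "$segments" -gt 0 ]; do
        segments=$(( segments - cwnd ))       # send one full window
        cwnd=$(( cwnd * 2 ))                  # exponential growth
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}

rounds_needed 14600 3    # prints 3
rounds_needed 14600 10   # prints 1
rounds_needed 65536 3    # prints 4
rounds_needed 65536 10   # prints 3
```

In this model a 14600-byte response (10 * MSS) fits in a single round trip with initcwnd 10, but needs three with initcwnd 3. The difference shrinks for larger transfers.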

How to Change initcwnd/initrwnd
ip route show
sudo ip route change default via 192.168.1.1 dev eth0 initcwnd 10
ip route show

ip route show
sudo ip route change default via 192.168.1.1 dev eth0 initrwnd 10
ip route show

Notes:
1. initrwnd sets the initial advertised receive window on Linux. It can only be adjusted on Linux kernel 2.6.33 and newer.
2. These commands change initcwnd and initrwnd only until the next reboot.
3. To persist the changes on Red Hat style systems, try the script below (copied from cdnplanet.com, but not verified):
cp /etc/sysconfig/network-scripts/ifup-post /etc/sysconfig/network-scripts/ifup-post.bak
sed -i -e "/^exit 0/d" /etc/sysconfig/network-scripts/ifup-post
echo "ip route change " $(ip route show | grep '^default' | sed 's/initcwnd [0-9]*//') " initcwnd 10" >> /etc/sysconfig/network-scripts/ifup-post
echo "exit 0" >> /etc/sysconfig/network-scripts/ifup-post
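
The commands above hard-code the gateway and device. Here is a sketch that builds the same kind of command from whatever the current default route is and only prints it, so it can be reviewed before running it with sudo. The function name build_route_cmd is made up for this illustration, and it assumes the first default route is the one to change.

```shell
# build_route_cmd ROUTE_LINE INITCWND INITRWND
# Hypothetical helper: strips any existing initcwnd/initrwnd from a line of
# "ip route show" output and prints the matching "ip route change" command.
build_route_cmd() {
    route=$(printf '%s\n' "$1" | sed -e 's/ *initcwnd [0-9]*//' \
                                     -e 's/ *initrwnd [0-9]*//' \
                                     -e 's/ *$//')
    echo "ip route change $route initcwnd $2 initrwnd $3"
}

# Dry run: print the command for the first default route, do not execute it.
if command -v ip >/dev/null 2>&1; then
    default_route=$(ip route show | grep '^default' | head -n 1)
    if [ -n "$default_route" ]; then
        build_route_cmd "$default_route" 10 10
    fi
fi
```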

Other Tunings
Disable net.ipv4.tcp_slow_start_after_idle
Set tcp_slow_start_after_idle to 0 to disable it. Otherwise a keepalive connection that has been idle for longer than its retransmission timeout (RTO) returns to slow start, so the next request on that connection begins with a small congestion window again.

Disable Nagle's algorithm
TCP implementations usually provide applications with an interface to disable Nagle's algorithm, typically the TCP_NODELAY socket option.

Other options on CentOS
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 87380 16777216"
sysctl -w net.core.netdev_max_backlog=30000
sysctl -w net.ipv4.tcp_congestion_control=htcp
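
To confirm which values are actually in effect, the settings can be read back from /proc/sys, where each sysctl key maps to a file path with the dots replaced by slashes. The function names sysctl_path and show_setting are made up for this sketch. It only reads values, so it does not need root.

```shell
# sysctl_path KEY
# Hypothetical helper: maps a sysctl key to its file under /proc/sys.
sysctl_path() {
    echo "/proc/sys/$(printf '%s' "$1" | tr . /)"
}

# show_setting KEY
# Prints the current value, or a note if the key does not exist here.
show_setting() {
    path=$(sysctl_path "$1")
    if [ -r "$path" ]; then
        echo "$1 = $(cat "$path")"
    else
        echo "$1 is not available on this system"
    fi
}

for key in net.core.rmem_max net.core.wmem_max \
           net.ipv4.tcp_rmem net.ipv4.tcp_wmem \
           net.core.netdev_max_backlog \
           net.ipv4.tcp_congestion_control \
           net.ipv4.tcp_slow_start_after_idle; do
    show_setting "$key"
done
```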

Reference
http://kernelnewbies.org/Linux_2_6_39
http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
http://monolight.cc/2010/12/increasing-tcp-initial-congestion-window/
http://www.osischool.com/protocol/Tcp/slidingWindow/index.php
