Network Performance Tuning

From Server rental store

This article details how to tune your Linux server's network stack for optimal performance, focusing on TCP tuning, MTU, network buffer sizes, and congestion control algorithms. This is crucial for applications sensitive to latency and throughput, such as web servers, database servers, and file transfer services.

Prerequisites

Before proceeding, ensure you have:

  • Root or sudo privileges on your Linux server.
  • Basic understanding of networking concepts (IP addresses, ports, TCP/UDP).
  • SSH access to your server.
  • Familiarity with the `sysctl` command.
  • Knowledge of your network infrastructure (router, switch configurations) to understand potential MTU limitations.

Understanding Network Performance Factors

Network performance is influenced by several factors, including:

  • Bandwidth: The maximum rate at which data can be transferred.
  • Latency: The time it takes for a data packet to travel from source to destination.
  • Packet Loss: The percentage of data packets that fail to reach their destination.
  • Jitter: The variation in packet delay.
  • TCP Window Size: The amount of data that can be sent before an acknowledgment is received.
  • MTU (Maximum Transmission Unit): The largest packet size, in bytes, that can be transmitted over a network interface.

Tuning these parameters can significantly improve the responsiveness and throughput of your server.

Checking Current Network Settings

It's essential to know your current settings before making changes.

Checking TCP Parameters

You can view current TCP-related kernel parameters using `sysctl`.

sudo sysctl -a | grep net.ipv4.tcp

Example Output Snippet:

net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_fin_timeout = 60
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_rmem = 4096	87380	4194304
net.ipv4.tcp_wmem = 4096	16384	4194304
net.ipv4.tcp_congestion_control = cubic

Explanation:

  • `tcp_rmem` and `tcp_wmem`: Define the minimum, default, and maximum receive and send buffer sizes for TCP sockets, respectively. The default values are often conservative.
  • `tcp_congestion_control`: Specifies the TCP congestion control algorithm in use. `cubic` is the default on many modern Linux distributions.

Checking MTU

The MTU can be checked using the `ip` command.

ip link show

Example Output Snippet:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:a1:b2:c3 brd ff:ff:ff:ff:ff:ff

Explanation: Look for your primary network interface (e.g., `eth0`, `ens18`). The `mtu` value indicates the maximum packet size. A common value for Ethernet is 1500.

Checking Current Congestion Control Algorithm

As seen in the `sysctl` output, you can also check this directly:

sysctl net.ipv4.tcp_congestion_control

Tuning TCP Buffer Sizes

Increasing TCP receive and send buffer sizes can allow for larger data transfers without waiting for acknowledgments, which is beneficial for high-bandwidth, high-latency connections (often found in WANs).

Determining Optimal Buffer Sizes

The optimal buffer size depends on your network's bandwidth-delay product (BDP). BDP = Bandwidth (bits/sec) * Round-Trip Time (seconds). The TCP window size should ideally be at least the BDP to keep the pipe full.

  • Bandwidth: Measured in bits per second. For example, 1 Gbps = 1,000,000,000 bps.
  • Round-Trip Time (RTT): Measured in seconds. You can estimate this using `ping`.

Example Calculation: If your bandwidth is 100 Mbps (100,000,000 bps) and your RTT is 50 ms (0.05 seconds):

BDP = 100,000,000 bps * 0.05 s = 5,000,000 bits = 625,000 bytes

So, a TCP window size of at least 625 KB would be beneficial.
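The calculation above can be scripted with plain shell arithmetic. The bandwidth and RTT figures below are the example's assumptions; substitute your own measured values:

```shell
#!/bin/sh
# Bandwidth-delay product from the worked example (assumed values;
# replace with your measured bandwidth and RTT).
BW_BPS=100000000   # 100 Mbps, in bits per second
RTT_MS=50          # round-trip time from ping, in milliseconds
# bytes = (bits/sec / 8) * (ms / 1000); ordered to avoid integer overflow
BDP_BYTES=$(( BW_BPS / 8 * RTT_MS / 1000 ))
echo "BDP: $BDP_BYTES bytes"
```

For the example values this prints a BDP of 625,000 bytes, matching the hand calculation.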

Applying Temporary Changes

You can temporarily change buffer sizes using `sysctl`. These changes will revert upon reboot.

# Set receive buffer sizes (min, default, max) in bytes
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 6291456"

# Set send buffer sizes (min, default, max) in bytes
sudo sysctl -w net.ipv4.tcp_wmem="4096 16384 6291456"

Explanation: We've increased the maximum values significantly. The exact values should be based on your BDP calculation. For a 1 Gbps link with 50 ms RTT, the BDP is 6,250,000 bytes (about 6.25 MB), so the maximum of 6,291,456 bytes (6 MiB) used above comfortably covers it.

Applying Permanent Changes

To make these changes persistent, edit the `sysctl` configuration file.

1. Open the file:

sudo nano /etc/sysctl.conf

2. Add or modify the following lines:

    net.ipv4.tcp_rmem = 4096 87380 6291456
    net.ipv4.tcp_wmem = 4096 16384 6291456
    

3. Save and close the file.

4. Apply the changes without rebooting:

sudo sysctl -p

Troubleshooting:

  • "Bad value" error: Ensure you are using the correct format (min default max) and that the values are within reasonable kernel limits. You might need to check your kernel's configuration.
  • No improvement or degradation: The default values might be sufficient for your network, or the bottleneck might be elsewhere (CPU, disk I/O, application).

Tuning MTU

The MTU dictates the largest packet size. If the MTU is not consistent across a network path (e.g., between your server and a client), packets might be fragmented, leading to performance degradation and potential connectivity issues.

Finding the Optimal MTU

The optimal MTU is often determined by the smallest MTU along the entire path between two communicating hosts. This is known as the "Path MTU". The kernel's "Path MTU Discovery" (PMTUD) mechanism can find it automatically, but it becomes unreliable when firewalls along the path drop the ICMP "fragmentation needed" messages it depends on. A more practical approach is to test common MTU values.

  • **1500:** Standard Ethernet MTU.
  • **1492:** Common for PPPoE connections (1500 minus 8 bytes of PPPoE header overhead).
  • **9000:** Jumbo frames, typically used in high-performance data center networks.

Testing MTU

You can test MTU by sending `ping` packets with the "do not fragment" flag set and varying sizes.

1. Test with a specific MTU value (e.g., 1500):

    # Ping a known host (e.g., google.com or your gateway)
    # -M do: do not fragment
    # -s: size of data payload
    # The total packet size will be size + 28 (IP and ICMP headers)
    ping -M do -s 1472 8.8.8.8
    
   If this succeeds, try a slightly larger size. If it fails, try a smaller size.

2. To find the largest working payload for the path:

   If `-s 1472` succeeds, the path supports an MTU of 1500. If it fails, reduce the payload and retry (e.g., `-s 1400` corresponds to an MTU of 1428). The largest payload that succeeds, plus 28 bytes of headers, is the path MTU.

Troubleshooting:

  • "Packet needs to be fragmented but DF set": This means the MTU is too large for the path. Reduce the `-s` value in your `ping` command.
  • Ping fails even with small sizes: The destination might be blocking ICMP echo requests, or there's a more fundamental network issue. Try a different target.
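The manual trial-and-error above can be sketched as a binary search. In this self-contained sketch, `probe()` is simulated against an assumed path MTU of 1500 so it runs anywhere; on a real host you would replace its body with the `ping` command shown in the comment:

```shell
#!/bin/sh
# Binary search for the largest payload that fits without fragmentation.
# probe() is SIMULATED against an assumed path MTU of 1500 so the sketch
# is self-contained; on a real host, replace its body with:
#   ping -c 1 -W 1 -M do -s "$1" 8.8.8.8 >/dev/null 2>&1
PATH_MTU=1500
probe() { [ "$1" -le $(( PATH_MTU - 28 )) ]; }  # 28 bytes IP + ICMP headers
lo=1000; hi=9000   # search window: below/above any plausible payload
while [ $(( hi - lo )) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    if probe "$mid"; then lo=$mid; else hi=$mid; fi
done
echo "largest payload: $lo bytes -> path MTU $(( lo + 28 ))"
```

With the simulated path MTU of 1500 this converges on a payload of 1472; with `PATH_MTU=9000` the same search converges on 8972, the jumbo-frame payload.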

Applying MTU Changes

Once you've determined the optimal MTU, apply it to your network interface.

1. Apply temporarily:

sudo ip link set dev eth0 mtu 1500
   Replace `eth0` with your interface name and `1500` with your chosen MTU.

2. Apply permanently:

   This depends on your distribution and network management tools.
   *   `netplan` (Ubuntu 18.04+):
       Edit your Netplan configuration file (e.g., `/etc/netplan/00-installer-config.yaml`).
        network:
          ethernets:
            eth0:
              dhcp4: no
              addresses: [192.168.1.100/24]
              mtu: 1500
        
       Then run:
sudo netplan apply
   *   `ifupdown` (Debian/older Ubuntu):
       Edit `/etc/network/interfaces`.
        auto eth0
        iface eth0 inet static
            address 192.168.1.100
            netmask 255.255.255.0
            gateway 192.168.1.1
            mtu 1500
        
       Then restart the interface:
sudo ifdown eth0 && sudo ifup eth0
   *   `NetworkManager` (CentOS/RHEL/Fedora):
       Use `nmcli`.
sudo nmcli connection modify eth0 802-3-ethernet.mtu 1500
       Then restart the connection:
sudo nmcli connection down eth0 && sudo nmcli connection up eth0

Security Implications:

  • Incorrect MTU settings can lead to packet fragmentation, which can be exploited in some network attacks (e.g., fragmentation attacks). Ensure your firewall rules handle fragmented packets appropriately.
  • Using Jumbo Frames (MTU 9000) requires that all devices in the network path (switches, routers, NICs) support and are configured for it. Mismatched MTUs can cause connectivity issues.

TCP Congestion Control Algorithms

Congestion control algorithms manage how TCP sends data to avoid overwhelming the network. Different algorithms perform better in different network conditions.

Common Algorithms

  • Cubic: The default on many modern Linux systems. It's designed to perform well on high-speed, high-latency networks.
  • BBR (Bottleneck Bandwidth and Round-trip propagation time): Developed by Google. Aims to improve throughput and reduce latency by modeling the network path. It can be very effective but sometimes causes issues with certain network devices.
  • Reno/NewReno: Older algorithms, generally less effective on modern high-speed networks.

Checking Availability

You can see which algorithms are available on your system:

cat /proc/sys/net/ipv4/tcp_available_congestion_control

Example Output:

reno cubic bbr

Enabling and Using BBR

BBR can often provide significant performance improvements, especially on links with high latency or packet loss.

1. Ensure BBR is available:

   Check the output of `cat /proc/sys/net/ipv4/tcp_available_congestion_control`. If `bbr` is not listed, try loading the kernel module with `sudo modprobe tcp_bbr` and check again.

2. Enable BBR:

   You need to set both `net.ipv4.tcp_congestion_control` and `net.core.default_qdisc` (`tcp_available_congestion_control` is read-only and merely reports what the kernel offers).
   Temporarily:
    sudo sysctl -w net.ipv4.tcp_congestion_control=bbr
    sudo sysctl -w net.core.default_qdisc=fq
    
   Explanation:
   *   `net.ipv4.tcp_congestion_control=bbr`: Sets BBR as the active algorithm.
   *   `net.core.default_qdisc=fq`: Sets the default queueing discipline to `fq` (Fair Queue), which is recommended for BBR.
   Permanently:
   Edit `/etc/sysctl.conf`:
    sudo nano /etc/sysctl.conf
    
   Add these lines:
    net.core.default_qdisc = fq
    net.ipv4.tcp_congestion_control = bbr
    
   Apply the changes:
sudo sysctl -p
   Troubleshooting:
   *   Connectivity Issues after enabling BBR: Some older network hardware or specific configurations might not play well with BBR. If you experience issues, revert to `cubic`.
   *   No noticeable improvement: BBR's effectiveness is highly dependent on network conditions. If your network is already highly optimized or has very low latency, the gains might be minimal.
   *   `fq` not available: Ensure your kernel supports `fq`. It's standard on most modern kernels.
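As an alternative to editing `/etc/sysctl.conf` directly, systemd-based distributions also read drop-in files from `/etc/sysctl.d/`; a dedicated file keeps this tuning easy to revert. The filename `90-bbr.conf` below is an arbitrary choice:

```
# /etc/sysctl.d/90-bbr.conf
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
```

Load it with `sudo sysctl --system`, which applies `/etc/sysctl.conf` together with every drop-in file.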

Network Buffer Tuning for Specific Applications

While `tcp_rmem` and `tcp_wmem` are global, some applications might benefit from their own specific buffer tuning. For example, applications that stream large amounts of data might have their own socket buffer settings.

  • Nginx: The `client_body_buffer_size` and `large_client_header_buffers` directives can influence how Nginx handles client requests.
  • Databases: Database servers often have their own network-related tuning parameters in their configuration files.

Refer to the documentation of your specific applications for details on their network tuning options.

Performance Benchmarking

After making changes, it's crucial to benchmark your network performance to verify improvements and identify regressions.

  • `iperf3`: A popular tool for measuring network bandwidth.
   *   Server side:
iperf3 -s
   *   Client side:
iperf3 -c your_server_ip
   You can vary the test parameters (add `-u` for UDP); for example, a 30-second TCP run with 4 parallel streams:
iperf3 -c your_server_ip -t 30 -P 4
   (Test for 30 seconds, 4 parallel streams)
  • `ping`: For latency and packet loss.
ping -c 100 your_server_ip
  • `mtr`: Combines `ping` and `traceroute` to diagnose network issues.
mtr your_server_ip

Benchmarking Workflow:

1. Run benchmarks with current settings and record baseline results.
2. Apply one set of changes (e.g., buffer sizes).
3. Re-run the benchmarks and compare against the baseline.
4. Keep the changes if they improve results; otherwise revert.
5. Repeat for the other tuning parameters (MTU, congestion control).
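The record-keeping part of this workflow can be sketched as a small wrapper that saves each labeled run to a file for later comparison. The `iperf3` invocation is an assumed default; override `BENCH_CMD` with any benchmark command you prefer:

```shell
#!/bin/sh
# Sketch: capture a labeled benchmark run to a file so before/after
# results can be diffed. The default iperf3 command is an assumption;
# set BENCH_CMD to substitute your own benchmark.
label=${1:-baseline}
cmd=${BENCH_CMD:-"iperf3 -c your_server_ip -t 30 -P 4"}
outfile="bench-${label}.txt"
{
    echo "# run: $label  date: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "# cmd: $cmd"
    eval "$cmd"
} > "$outfile" 2>&1
echo "results saved to $outfile"
```

Running it once before tuning (`baseline`) and once after (`tuned`) leaves two timestamped files to compare side by side.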

Conclusion

Network performance tuning is an iterative process. By understanding your network conditions and carefully adjusting parameters like TCP buffer sizes, MTU, and congestion control algorithms, you can significantly enhance your server's responsiveness and throughput. Always benchmark before and after making changes to quantify the impact and ensure stability.