TCP/IP networking is a complex subject, and it gets really complicated when you try to pin down performance problems or fix an issue. It helps to have tools that can probe your system and confirm your suspicions or, better yet, confirm that there are no problems.
One such tool is the open source iperf3. Here is its description from GitHub:
iperf is a tool for active measurements of the maximum achievable bandwidth on IP networks. It supports tuning of various parameters related to timing, protocols, and buffers. For each test it reports the measured throughput / bitrate, loss, and other parameters.
This article shows you how to:
- Investigate bandwidth issues between two endpoints with iperf3
- Test User Datagram Protocol (UDP) multicast connectivity (used by Precision Time Protocol and other protocols for time synchronization)
- Find out about cyclic redundancy check (CRC) errors on a network interface
- Use ethtool and tcpdump to confirm that a faulty network interface or cable is interrupting traffic
- Write more complex scripts using Python 3
I will also briefly explain CPU affinity and why it might be important for iperf3.
Get started with iperf3
To follow this tutorial, you will need:
- A Linux distribution (I ran my examples on a Fedora server)
- The ability to run commands as root (using sudo , for example)
- A basic understanding of networking principles
Run this command to install iperf3. On Fedora:
$ sudo dnf install -y iperf3
Iperf3 works by running a client and a server that talk to each other. Here are some terms to know before you start using it:
- Throughput measures how many packets arrive at their destination successfully.
- Bandwidth is the maximum data transfer capacity of a network.
- Jitter is the variation in the time delay between when a signal is transmitted and when it is received. Good connections have a consistent response time.
- TCP stands for Transmission Control Protocol. It is a reliable protocol that ensures packets arrive in the same order they were sent, by way of a handshake.
- UDP doesn't have a handshake protocol like TCP. It is faster than TCP, but if a packet is lost, it will not be resent, and there is no guarantee that packets will arrive in the order they were sent.
In the demonstration in this article:
- The client and server connect over wired Ethernet interfaces. (I won't use wireless interfaces because they are more prone to jitter from outside noise.)
- My test uses the default settings (default port, and a TCP connection unless you override it with the --udp flag on the client).
The test confirms whether:
- The switch between the two machines supports 1,000 Mbit/sec connections and the interfaces are configured for that capacity.
- Full-duplex mode is enabled, so the card can send and receive data at the same time. You will confirm this later in the article with another tool, ethtool.
Without further ado, I'll get started.
Measure bandwidth and jitter
Here are the initial commands on the server:
[server ~]$ sudo ethtool eth0|rg -e 'Speed|Duplex'
Speed: 1000Mb/s
Duplex: Full
[server ~]$ ip --oneline address|rg 192
2: eth0 inet 192.168.1.11/24 brd 192.168.1.255 scope global dynamic eth0\ valid_lft 2090sec preferred_lft 2090sec
[server ~]$ iperf3 --server --bind 192.168.1.11 --affinity 1
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
And now the client:
[client ~]$ sudo ethtool eno1|rg -e 'Speed|Duplex'
Speed: 1000Mb/s
Duplex: Full
[client ~]$ iperf3 --client raspberrypi --bind 192.168.1.28 --affinity 1
Connecting to host raspberrypi, port 5201
[ 5] local 192.168.1.28 port 47609 connected to 192.168.1.11 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 111 MBytes 932 Mbits/sec 0 2.79 MBytes
[ 5] 1.00-2.00 sec 110 MBytes 923 Mbits/sec 0 2.98 MBytes
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1021 MBytes 857 Mbits/sec 0 sender
[ 5] 0.00-9.95 sec 1020 MBytes 860 Mbits/sec receiver
iperf Done.
Analyzing the results:
- Zero retransmissions (the Retr column). This is good and expected.
- The bitrate is around 860 Mbit/sec, close to the link's theoretical bandwidth. Switches have a limit on how much traffic the backplane can handle.
- TCP guarantees delivery of packets, so jitter is not reported here.
If you reverse the direction of the test (pass the --reverse flag on the client so the server sends and the client receives), you can check whether the link behaves the same both ways.
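By the way, if you would rather consume these numbers from a script than read them off the terminal, iperf3 can emit its report as JSON with the --json flag. Here is a minimal sketch in Python, assuming an iperf3 server is already listening on raspberrypi as in the example above:

#!/usr/bin/env python3
"""Run an iperf3 TCP test and print the summary bitrate.

Minimal sketch: assumes iperf3 is installed and a server is
already listening on the host below (as in the examples above).
"""
import json
import subprocess

SERVER = "raspberrypi"  # iperf3 server from the earlier examples

# --json makes iperf3 print a single JSON document instead of the table
completed = subprocess.run(
    ["iperf3", "--client", SERVER, "--json"],
    capture_output=True, text=True, check=True,
)
report = json.loads(completed.stdout)

# For a TCP test, 'end.sum_sent' and 'end.sum_received' hold the totals
sent = report["end"]["sum_sent"]
received = report["end"]["sum_received"]
print(f"sender:   {sent['bits_per_second'] / 1e6:.0f} Mbit/sec, "
      f"{sent['retransmits']} retransmits")
print(f"receiver: {received['bits_per_second'] / 1e6:.0f} Mbit/sec")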
Test the UDP bandwidth
To test UDP, do the following on the client only:
[client ~]$ iperf3 --client raspberrypi --bind 192.168.1.28 --udp --affinity 1
Connecting to host raspberrypi, port 5201
[ 5] local 192.168.1.28 port 47985 connected to 192.168.1.11 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-1.00 sec 129 KBytes 1.05 Mbits/sec 91
[ 5] 1.00-2.00 sec 127 KBytes 1.04 Mbits/sec 90
[ 5] 2.00-3.00 sec 129 KBytes 1.05 Mbits/sec 91
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.00 sec 1.25 MBytes 1.05 Mbits/sec 0.000 ms 0/906 (0%) sender
[ 5] 0.00-9.99 sec 1.25 MBytes 1.05 Mbits/sec 0.028 ms 0/906 (0%) receiver
Here are the results:
- The bitrate sits right at the target bitrate: iperf3 defaults to 1 Mbit/sec for UDP tests. Plus, there's no packet loss, which is great.
- UDP does not guarantee delivery, so iperf3 reports lost datagrams and jitter (and both have good values here).
You may be wondering what that --affinity flag is. It's not really necessary for testing bandwidth in this simple example, but it gives me an excuse to talk about affinity.
Quick detour: CPU affinity, NUMA, isolcpus
If you were curious and checked the iperf3 documentation and examples, you probably saw references to CPU or processor affinity.
So what is it? From Wikipedia:
Processor affinity, or CPU pinning or "cache affinity," enables the binding and unbinding of a process or a thread to a central processing unit (CPU) or a range of CPUs, so that the process or thread will execute only on the designated CPU or CPUs rather than any CPU.
Why would you want to bind a process to a specific CPU or group of CPUs?
Pinning prevents resource contention: a process pinned to one group of CPUs cannot steal CPU time from a process pinned to another group. And on non-uniform memory access (NUMA) hardware, groups of CPUs share L1, L2, and L3 caches and a local bank of main memory, so you can use pinning to make sure a process always uses the memory closest to its CPUs.
What does a server with multiple NUMA nodes look like? You can find out with lscpu | rg NUMA:
[client ~]$ lscpu|rg NUMA
NUMA node(s): 2
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15
This is a 16-CPU server with two NUMA nodes. (This is a simplified example; a machine with Hyper-Threading enabled looks different, and depending on the application, you may decide to disable it.)
Remember that you can use CPU affinity not only to increase network performance but also disk performance.
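On NUMA hardware it also helps to know which node your NIC is attached to, so you can pin your network process to CPUs on the same node. For PCI NICs, the kernel exposes this in sysfs; here is a quick sketch (eth0 is the interface from the earlier examples, and a value of -1 means the device has no NUMA affinity):

#!/usr/bin/env python3
"""Print the NUMA node a network interface is attached to.

Sketch assuming a PCI NIC; the sysfs file is absent for
virtual interfaces, and -1 means no NUMA affinity.
"""
from pathlib import Path

IFACE = "eth0"  # interface name from the earlier examples
node_file = Path(f"/sys/class/net/{IFACE}/device/numa_node")

if node_file.exists():
    print(f"{IFACE} is attached to NUMA node {node_file.read_text().strip()}")
else:
    print(f"{IFACE} has no NUMA information (virtual interface?)")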
Going back to iperf3, you can pin it to a specific CPU with -A or --affinity. For example, using the third CPU (CPUs are numbered from 0 to n-1, so CPU 2) looks like this:
# Equivalent of running iperf3 with numactl: /bin/numactl --physcpubind=2 iperf3 -c remotehost
iperf3 --affinity 2 --client remotehost
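You can apply the same pinning from Python itself with nothing but the standard library. A minimal sketch (remotehost is a placeholder for your iperf3 server, as in the command above):

#!/usr/bin/env python3
"""Pin the current process to CPU 2, then exec iperf3.

Minimal sketch of CPU affinity from Python; 'remotehost' is a
placeholder for your iperf3 server.
"""
import os

# Restrict this process (pid 0 = ourselves) to CPU 2 only.
os.sched_setaffinity(0, {2})
print(f"Now allowed on CPUs: {os.sched_getaffinity(0)}")

# exec* replaces this process with iperf3; the affinity mask is inherited.
os.execvp("iperf3", ["iperf3", "--client", "remotehost"])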
Remember that you may also need to tell the operating system to avoid scheduling host processes on those CPUs. If your system uses grubby, you can do this with the isolcpus kernel parameter:
# Find the default kernel
$ sudo grubby --default-kernel
# Use that information and add isolcpus parameter, then reboot
$ sudo grubby --update-kernel=/boot/vmlinuz-5.14.18-100.fc33.x86_64 --args="isolcpus=2"
$ sudo shutdown -r now 'Updated kernel isolcpus, need to reboot'
Again, this isn't necessary to troubleshoot a network problem, but it can come in handy if you want to make iperf3 behave like one of your optimized applications.
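After rebooting, you can confirm the parameter took effect, because the kernel publishes the isolated CPU list in sysfs. A quick check (the path should exist on any modern kernel):

#!/usr/bin/env python3
"""Show which CPUs the kernel has isolated via isolcpus."""
from pathlib import Path

isolated = Path("/sys/devices/system/cpu/isolated").read_text().strip()
print(f"Isolated CPUs: {isolated or 'none'}")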
Optimization is a complex topic, so grab a cup of coffee (or two) and get ready to start reading.
Use iperf3 to detect lost packets and CRC errors
A CRC error is caused by a faulty physical device (network card, switch port, cable) or by a mismatch between the full- and half-duplex configurations on two devices. CRC errors are sometimes difficult to track down on switches running in cut-through mode, because the switch starts forwarding a frame before it has fully received and checked it, so bad frames propagate downstream.
Here is a simplified scenario for confirming that a new NIC link works without CRC or receive/transmit (Rx/Tx) errors, meaning the card, cable, and switch port are all OK.
With that in mind, you could do a simple test to make sure the link status is good:
- Capture the CRC and dropped-packet counters on the NIC under test.
- Run iperf3 in TCP mode for longer than usual.
- Capture the counters again and compare them.
If the difference is greater than zero, then:
- Check the full-duplex mode on both the card and the switch port (ethtool).
- Replace the cable.
- Reseat or replace the network card.
- Change the port on the switch.
You get the picture: iperf3 helps you "burn in" the link and flush out any unwanted behavior before you put the interface into production.
Here is the process in action. First, take a snapshot of the statistics on the iperf3 server:
[server ~]$ sudo ethtool --statistics eth0| rg -i -e 'dropped|error'
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
rxq0_errors: 0
rxq0_dropped: 0
rxq1_errors: 0
rxq1_dropped: 0
rxq2_errors: 0
rxq2_dropped: 0
rxq3_errors: 0
rxq3_dropped: 0
rxq16_errors: 0
rxq16_dropped: 0
And the same on the client:
[client ~]$ sudo ethtool --statistics eno1| rg -i -e 'dropped|errors'
tx_errors: 0
rx_errors: 0
align_errors: 0
Run the iperf3 server:
[server ~]$ iperf3 --server --bind 192.168.1.11
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Run iperf3 on the client for 120 seconds:
[client ~]$ iperf3 --client raspberrypi --bind 192.168.1.28 --time 120
Connecting to host raspberrypi, port 5201
[ 5] local 192.168.1.28 port 41337 connected to 192.168.1.11 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 111 MBytes 934 Mbits/sec 0 2.94 MBytes
[ 5] 1.00-2.00 sec 111 MBytes 933 Mbits/sec 0 2.95 MBytes
[ 5] 2.00-3.00 sec 111 MBytes 933 Mbits/sec 0 2.95 MBytes
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-120.00 sec 11.0 GBytes 787 Mbits/sec 0 sender
[ 5] 0.00-119.70 sec 11.0 GBytes 789 Mbits/sec receiver
# Measure again ...
[client ~]$ sudo ethtool --statistics eno1| rg -i -e 'dropped|errors'
tx_errors: 0
rx_errors: 0
align_errors: 0
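This before-and-after comparison cries out for automation. Here is a minimal sketch that snapshots the error and drop counters, waits while you run iperf3, and reports any counter that moved. Run it as root, as with the manual ethtool commands; eno1 is the NIC from the client examples, and the full version of this idea is the script linked later in this article:

#!/usr/bin/env python3
"""Diff ethtool error/drop counters before and after an iperf3 run.

Sketch only: run it as root, start your iperf3 test, then press
Enter. Assumes ethtool is installed and 'eno1' is the NIC under test.
"""
import re
import subprocess

IFACE = "eno1"  # NIC under test, as in the client examples above

def error_counters(iface):
    """Return {counter_name: value} for error/dropped statistics."""
    out = subprocess.run(
        ["ethtool", "--statistics", iface],
        capture_output=True, text=True, check=True,
    ).stdout
    counters = {}
    for line in out.splitlines():
        match = re.match(r"\s*(\S*(?:error|dropped)\S*):\s*(\d+)", line, re.I)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

before = error_counters(IFACE)
input("Run your iperf3 test now, then press Enter to compare... ")
after = error_counters(IFACE)

deltas = {k: after[k] - before[k]
          for k in before if after.get(k, before[k]) != before[k]}
if deltas:
    print(f"Counters that changed on {IFACE}: {deltas}")
else:
    print(f"No new errors or drops on {IFACE}. The link looks healthy.")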
Now I'll talk about another useful tool for getting network interface statistics, ethtool.
What is ethtool?
As Wikipedia explains:
ethtool is the primary means in Linux kernel-based operating systems (primarily Linux and Android) for displaying and modifying the parameters of network interface controllers (NICs) and their associated device driver software from application programs running in userspace.
Here are a couple of questions for you once you finish reading the ethtool man page:
- What does the sudo ethtool -g eno1 command do?
- And what about sudo ethtool -s eno1 speed 1000 duplex full autoneg on?
The ethtool utility is another tool you should have in your toolset.
Automate iperf3 with Python 3
You may notice that iperf3 has a library that allows you to integrate the tool with other languages, including Python:
[client ~]$ rpm -qil iperf3|rg libiperf
/usr/lib64/libiperf.so.0
/usr/lib64/libiperf.so.0.0.0
/usr/share/man/man3/libiperf.3.gz
There are several Python bindings available:
- iperf3-python has an API to integrate iperf3 with Python, using those bindings.
- The Python ethtool module is available but marked as deprecated; still, it does everything this demonstration needs.
I won't cover the API here, but rather point you to the source code of a Python script that uses iperf3 and ethtool to detect network errors (as I did manually above). You can see it running below. Check out the repository and run the script. You will be amazed how easy it is to automate some tasks with Python.
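In the meantime, to give you a taste of the bindings, here is a minimal sketch using the iperf3-python API (the attribute names below follow that project's documentation, and raspberrypi is the server from the earlier examples):

#!/usr/bin/env python3
"""Run a short TCP test with the iperf3-python bindings.

Minimal sketch; install the bindings first (pip install iperf3)
and make sure an iperf3 server is listening on the host below.
"""
import iperf3

client = iperf3.Client()
client.server_hostname = "raspberrypi"  # server from the earlier examples
client.port = 5201
client.duration = 5

result = client.run()
if result.error:
    print(f"Test failed: {result.error}")
else:
    print(f"sender:   {result.sent_Mbps:.0f} Mbit/sec")
    print(f"receiver: {result.received_Mbps:.0f} Mbit/sec")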
What can you do next?
Learning never stops, so here are some tips and observations to keep you going:
- Fasterdata (ESnet's knowledge base) has multiple examples of using iperf with different parameters.
- Please note that isolcpus is considered deprecated, and cpusets are recommended instead. Refer to this Stack Overflow discussion to see how to work with cpuset.
- Now you know how to write your own troubleshooting scripts with the iperf3 Python API. You could even write an iperf3 server that shows results in a web browser (maybe combining it with FastAPI? A starting sketch follows below.)
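To get you started on that last idea, here is a tiny, hypothetical sketch of what such an endpoint could look like, assuming FastAPI and uvicorn are installed alongside the iperf3 bindings shown above:

#!/usr/bin/env python3
"""Hypothetical FastAPI endpoint that runs an iperf3 test on demand.

Run with: uvicorn thisfile:app --reload
Then visit http://localhost:8000/test/raspberrypi in a browser.
"""
import iperf3
from fastapi import FastAPI

app = FastAPI()

@app.get("/test/{host}")
def run_test(host: str):
    """Run a short TCP test against 'host' and return the summary."""
    client = iperf3.Client()
    client.server_hostname = host
    client.duration = 5
    result = client.run()
    if result.error:
        return {"host": host, "error": result.error}
    return {
        "host": host,
        "sent_Mbps": result.sent_Mbps,
        "received_Mbps": result.received_Mbps,
    }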