According to many, today's Internet does not move data as it should. Most of the world's cell phone users experience delays from seconds to minutes; public Wi-Fi at airports and conference venues is often worse.
For example, researchers in physics and climatology need to exchange petabytes of data with collaborators around the world, but find that their carefully designed multi-Gbps infrastructure often delivers only a few Mbps over intercontinental distances.
These problems stem from a design choice made when TCP congestion control was created in the 1980s: interpreting packet loss as “congestion”.
This equivalence was true at the time, but it was because of technological limitations, not first principles. As NICs (Network Interface Controllers) have evolved from Mbps to Gbps and memory chips from KB to GB, the relationship between packet loss and congestion has become more tenuous.
Today, TCP's loss-based congestion control, even with the current best of breed, CUBIC, is the main cause of these problems.
When bottleneck buffers are large, loss-based congestion control keeps them full, causing bufferbloat.
When bottleneck buffers are small, loss-based congestion control misinterprets the loss as a signal of congestion, leading to low throughput. Solving these problems requires an alternative to loss-based congestion control.
Finding this alternative requires an understanding of where and how network congestion originates.
This is where TCP BBR comes in. It is a TCP congestion control algorithm designed for the congestion patterns of the modern Internet.
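To get a feel for how hard a small amount of loss hits a loss-based sender, here is a minimal sketch based on the well-known Mathis et al. approximation for Reno-style TCP, throughput ≈ (MSS / RTT) × (C / √p). It is a rough steady-state model rather than a description of CUBIC, and the MSS of 1460 bytes and the 100 ms round-trip time are illustrative assumptions, not measurements from this article.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3 / 2)):
    """Approximate steady-state throughput (bits/s) of a Reno-style TCP flow.

    Uses the Mathis et al. approximation: (MSS / RTT) * (C / sqrt(p)).
    The constant C = sqrt(3/2) assumes periodic loss and no delayed ACKs.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Assumed values for illustration: 1460-byte MSS, 100 ms round trip.
    for loss in (0.0001, 0.001, 0.01):
        bps = mathis_throughput_bps(1460, 0.100, loss)
        print(f"loss={loss:.2%}  ~{bps / 1e6:.1f} Mbit/s")
```

Note that the link speed does not appear in the formula at all: under this model, at 1% loss and 100 ms the estimate is roughly 1.4 Mbit/s whether the link is 1 Gbps or 10 Gbps, which is the "few Mbps over intercontinental distances" effect described above.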
TCP: Bandwidth sharing is important.
TCP tries to balance two needs: being fast (transmitting data quickly) and being fair (sharing bandwidth among multiple users), with much more weight on fairness.
Most TCP implementations use a backoff algorithm that roughly halves the sending rate when loss is detected, so a connection often ends up using only about half of the bandwidth.
In short, if your server has 1 gigabit per second of outbound bandwidth, you can be fairly sure that the outbound traffic of a web server running on TCP will hardly exceed 500 or 600 megabits per second.
This is the main problem with TCP: its use leads to wasted (or rather unused) bandwidth.
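The backoff behaviour can be illustrated with a toy additive-increase/multiplicative-decrease (AIMD) loop. This is a deliberately simplified sketch, not how any real TCP stack is implemented: it assumes a single flow, no buffering at the bottleneck, a window that grows by one packet per round trip, and a halving on every loss. The function name and parameters are hypothetical.

```python
import random


def aimd_utilization(capacity, rounds, beta=0.5, random_loss=0.0, seed=0):
    """Return the fraction of link capacity used by one toy AIMD flow.

    capacity    -- packets the link can carry per round trip
    beta        -- multiplicative decrease factor applied on loss
    random_loss -- per-round probability of a loss unrelated to congestion
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    cwnd = 1.0
    delivered = 0.0
    for _ in range(rounds):
        delivered += min(cwnd, capacity)
        congested = cwnd > capacity            # loss caused by overshooting
        unlucky = rng.random() < random_loss   # loss caused by a bad link
        if congested or unlucky:
            cwnd = max(1.0, cwnd * beta)       # back off
        else:
            cwnd += 1.0                        # probe for more bandwidth
    return delivered / (capacity * rounds)


if __name__ == "__main__":
    print("clean link:", round(aimd_utilization(100, 10_000), 2))
    print("5% random loss per round:",
          round(aimd_utilization(100, 10_000, random_loss=0.05), 2))
```

Even on a clean link the sawtooth leaves part of the capacity unused, and as soon as some losses have nothing to do with congestion the flow backs off again and again without ever getting close to the link speed.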
BBR TCP: the important concepts
You can read the document for more details, but the bottom line is that BBR is a congestion control technology that:
Is designed to respond to actual congestion rather than packet loss. BBR models the network in order to send at the speed of the available bandwidth, and is reported to be 2700x faster than previous TCP over a 10 Gb, 100 ms link with 1% loss.
Is focused on improving performance when the network is not very good. TCP BBR balances fairness and utilization more accurately, resulting in better download speeds on the same network. The difference is most noticeable when network conditions are poor (although it does no harm on a squeaky-clean network).
Does not require the client to implement BBR. This is the real magic ingredient. Earlier improvements such as the QUIC protocol required both client and server to implement them; BBR does not require the client to use BBR as well. This is particularly relevant in developing countries, where older mobile platforms and limited bandwidth are common, or in areas where sites and services have not yet made the switch. With QUIC, if your browser did not support it, there was no way to improve anything.
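The idea of "modelling the network" can be sketched in a few lines. BBR stands for Bottleneck Bandwidth and Round-trip propagation time, and the two quantities it tracks are roughly the highest recent delivery rate and the lowest recent round-trip time. The class below is only an illustration of that idea under simplifying assumptions (fixed-size sample windows, hypothetical names); it is not the kernel implementation.

```python
from collections import deque


class PathModel:
    """Toy model of a network path: max recent bandwidth, min recent RTT."""

    def __init__(self, bw_window=10, rtt_window=10):
        # Window sizes are counted in samples here, purely for simplicity.
        self.bw_samples = deque(maxlen=bw_window)
        self.rtt_samples = deque(maxlen=rtt_window)

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        """Record one delivery-rate sample and one RTT sample."""
        if interval_s <= 0:
            raise ValueError("interval_s must be positive")
        self.bw_samples.append(delivered_bytes / interval_s)
        self.rtt_samples.append(rtt_s)

    @property
    def btl_bw(self):
        """Estimated bottleneck bandwidth in bytes per second."""
        return max(self.bw_samples)

    @property
    def rt_prop(self):
        """Estimated round-trip propagation time in seconds."""
        return min(self.rtt_samples)

    @property
    def bdp_bytes(self):
        """Bandwidth-delay product: how much data the path can hold."""
        return self.btl_bw * self.rt_prop


if __name__ == "__main__":
    model = PathModel()
    model.on_ack(125_000, 0.1, 0.100)
    model.on_ack(62_500, 0.1, 0.120)   # a bad interval, e.g. after a loss
    model.on_ack(125_000, 0.1, 0.105)
    print(model.btl_bw, model.rt_prop, model.bdp_bytes)
```

The point of the sketch is the middle sample: one bad interval lowers a single measurement but does not by itself cut the bandwidth estimate, whereas a loss-based algorithm would have backed off immediately.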
Let's take a closer look at BBR on the road.
This all sounds good, but let's see how good this technology is in practice and not just in theory. To test things out, I set up two VMs in two different regions and ran a quick Iperf test to check their performance:
[4] 0.0-10.0 sec 9.97 GBytes 8.55 Gbits/sec
To simulate the conditions where BBR is most useful (high packet loss), we ran the following tc command, which drops a given percentage of packets (1.5% here):
sudo tc qdisc add dev eth0 root netem loss 1.5%
When the same virtual machines connect again, we see a significant drop in performance (as we would expect):
[3] 0.0-10.3 sec 1.10 GBytes 921 Mbits/sec
So we activated BBR on the server side only, using the following command (available on Linux kernel 4.10 or later):
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr
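It can be worth confirming that the kernel actually accepted the setting. A minimal sketch, assuming a Linux host where the TCP settings are exposed under /proc/sys/net/ipv4 (the function name and the proc_root parameter are hypothetical, added only to make the helper easy to try out):

```python
from pathlib import Path


def congestion_control_status(proc_root="/proc/sys/net/ipv4"):
    """Return (current algorithm, list of algorithms the kernel offers)."""
    root = Path(proc_root)
    current = (root / "tcp_congestion_control").read_text().strip()
    available = (root / "tcp_available_congestion_control").read_text().split()
    return current, available


if __name__ == "__main__":
    try:
        current, available = congestion_control_status()
    except OSError as exc:
        # Not a Linux host, or /proc is not readable from here.
        print("could not read kernel settings:", exc)
    else:
        print("current:", current)
        print("available:", ", ".join(available))
        if "bbr" not in available:
            print("bbr is not listed; the tcp_bbr module may need loading")
```

If the first line printed is not bbr, the iperf numbers that follow will still reflect the previous algorithm.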
We then ran the same iperf test again:
[3] 0.0-10.0 sec 2.90 GBytes 2.49 Gbits/sec
We see roughly 2.7 times the throughput of the lossy connection without BBR (2.49 Gbits/sec versus 921 Mbits/sec), just by setting a simple flag on the server.
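The three iperf results above can be compared directly. The sketch below parses iperf-style summary lines and computes the ratios; the helper name is hypothetical, and it assumes iperf reports bit rates in decimal units (1 Gbit = 1000 Mbit).

```python
import re

# Assumption: decimal prefixes for bit rates.
_UNITS = {"K": 1e3, "M": 1e6, "G": 1e9}
# Matches e.g. "8.55 Gbits/sec" and tolerates spaces around the slash.
_RATE = re.compile(r"([\d.]+)\s*([KMG])bits\s*/\s*sec")


def iperf_rate_bps(line):
    """Extract the bandwidth figure from an iperf summary line, in bits/s."""
    match = _RATE.search(line)
    if match is None:
        raise ValueError(f"no bandwidth figure in: {line!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


if __name__ == "__main__":
    baseline = iperf_rate_bps("[4] 0.0-10.0 sec 9.97 GBytes 8.55 Gbits/sec")
    lossy_default = iperf_rate_bps("[3] 0.0-10.3 sec 1.10 GBytes 921 Mbits/sec")
    lossy_bbr = iperf_rate_bps("[3] 0.0-10.0 sec 2.90 GBytes 2.49 Gbits/sec")

    print(f"gain from BBR under loss: {lossy_bbr / lossy_default:.2f}x")
    print(f"share of clean baseline without BBR: {lossy_default / baseline:.0%}")
    print(f"share of clean baseline with BBR:    {lossy_bbr / baseline:.0%}")
```

With the figures from this test the gain comes out at about 2.7x, while still leaving a sizeable gap to the loss-free baseline.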
Where to use TCP BBR? Everywhere.
Basically, there is no real reason not to use a technology like TCP BBR; agonizing over where it should or should not be used would be a waste of time, and perhaps even foolish.
Of course, most of the benefit will be seen by client endpoints in areas with poor network conditions. So if you enable it between VMs, don't be alarmed if things aren't significantly better. That said, there is no downside to using it, even where connectivity is good.
Among its less conventional uses, BBR can also help cope with a TCP-based volumetric DDoS attack or with the Slashdot effect, where a website is hit by hundreds of thousands of visitors per minute.
In practice, it is just a matter of running Linux kernel 4.10 or later and enabling a flag with a single command.
Why not do it?
https://www.youtube.com/watch?v=4LBmFJXX5KU