Just changing Linux's default congestion control (net.ipv4.tcp_congestion_control) to 'bbr' can make a _huge_ difference in some scenarios, presumably long-distance paths with sporadic packet loss, jitter, and encapsulation.
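For anyone who wants to try it, a minimal sketch of the switch on a typical Linux box (the sysctl names are real; the persistence file path is just a common convention, adjust for your distro):

```shell
# See which congestion control algorithms the kernel currently offers
sysctl net.ipv4.tcp_available_congestion_control

# BBR ships as a module on most distros; load it if 'bbr' isn't listed
sudo modprobe tcp_bbr

# Switch the default for new connections
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# BBR relies on packet pacing; the fq qdisc provides it on older kernels
# (kernels >= 4.13 can pace internally, so this is optional there)
sudo sysctl -w net.core.default_qdisc=fq

# Persist across reboots
echo "net.ipv4.tcp_congestion_control=bbr" | sudo tee /etc/sysctl.d/90-bbr.conf
```

Note this only changes the default for new connections; existing sockets keep whatever algorithm they started with.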
Over the last year, I was troubleshooting issues with the following connection flow:
client host <-- HTTP --> reverse proxy host <-- HTTP over Wireguard --> service host
On average, I could not get better than 20% of theoretical max throughput. Also, connections tended to slow to a crawl over time; I had hacky workarounds like forcing connections to close frequently. Switching congestion control to 'bbr' finally gave close to theoretical max throughput and reliable connections.
I don't know enough about TCP to understand why it works. The change needed to be made on both sides of Wireguard.
The difference is that BBR does not use loss as a signal of congestion. Most TCP stacks will cut their send windows in half (or otherwise greatly reduce them) at the first sign of loss. So if you're on a lossy VPN, or sending a huge burst at 1Gb/s on a 10Mb/s VPN uplink, TCP will normally see loss, and back way off.
BBR tries to find the Bottleneck Bandwidth rate, i.e., the bandwidth of the narrowest or most congested link. It does this by measuring the round trip time and increasing the transmit rate until the RTT increases. When the RTT increases, the assumption is that a queue is building at the narrowest portion of the path, and the increase in RTT is proportional to the queue depth. It then drops the rate until the RTT normalizes as the queue drains. It sends at that rate for a period of time, then slightly increases the rate to see if the RTT increases again (if not, it means the queuing it saw before was due to competing traffic which has since cleared).
I upgraded from a 10Mb/s cable uplink to 1Gb/s symmetrical fiber a few years ago. When I did so, I was ticked that my upload speed on my corp. VPN remained at 5Mb/s or so. When I switched to RACK TCP (or BBR) on FreeBSD, my upload went up by a factor of 8 or so, to about 40Mb/s, which is the limit of the VPN.
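On FreeBSD, RACK (and BBR) live in separate TCP stack modules selected via a sysctl rather than Linux's single congestion-control knob; roughly like this (module and sysctl names as I recall them on recent releases, so verify against yours):

```shell
# Load the RACK TCP stack module (tcp_rack.ko on recent releases;
# tcp_bbr.ko similarly provides the BBR stack)
kldload tcp_rack

# List the TCP stacks the kernel now knows about
sysctl net.inet.tcp.functions_available

# Make RACK the default stack for new connections
sysctl net.inet.tcp.functions_default=rack
```

As with the Linux change, only connections opened after the switch pick up the new stack.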
You seem quite knowledgeable in this domain. Have you authored any blog posts to expand on this topic? I would welcome the chance to learn more from you.
No, fast retransmit basically does what it says -- retransmits things quicker. However, it is orthogonal to what the congestion control (CC) algorithm decides to do with the send window in the face of loss. Older CC like Reno halves the send window. Newer ones like CUBIC are more aggressive, and cut the window less (and grow it faster). However, RACK and BBR are still superior in the face of a lossy link.
Depending on the particular situation maybe vegas would work as well?
In particular, since Wireguard is UDP, using vegas over Wireguard seems to me like it should work well (based on a very limited understanding, though :/ ). It is just a question of how well it would work on the other side of the reverse proxy, since I don't think it can be set per link?
Er, I was confused; of course being over UDP won't make the kind of difference I was thinking, since congestion control only governs when packets are sent. Although I heard a while back that UDP packets can be dropped more quickly during congestion. If that is the case, and the congestion isn't too severe but still leads to dropped packets because the traffic is UDP, then vegas could possibly help.
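For what it's worth, on Linux the congestion control can actually be set per route rather than only globally, via iproute2's congctl attribute; a sketch (the 10.0.0.0/24 prefix and wg0 device are placeholders for your Wireguard network):

```shell
# Use BBR only for traffic routed over the Wireguard interface,
# leaving the system default (e.g. cubic) for everything else
ip route replace 10.0.0.0/24 dev wg0 congctl bbr

# Verify the attribute took effect
ip route show 10.0.0.0/24
```

That gives per-path control on the hosts you administer, though of course it doesn't help on the far side of a reverse proxy you don't control.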
Yes, BBRv1 has fairness issues when used at scale vs certain other algorithms. No, that doesn't mean people who find it tunes their performance 5x in a particular use case with high latency and some loss should stop talking about how it helps in that scenario. YouTube ran it without the internet burning to the ground, so using it in these niche cases with personal tuning almost certainly results in a net good, even though the algorithm wasn't perfect. BBRv3 should scale to everyone with better fairness, but even BBRv1 behaves far better, fairness-wise, than almost any UDP stream anyway.
The original cargo cultists built runways on islands to cause supplies to be dropped off. It didn't work. If someone copy and pasted something they don't fully understand off the Internet, but it works, can you really blame them for it, or call them cargo cultists?
It's easy to coax BBR into converging on using 20% of a shared link instead of 50% (cohabiting with one other stream).
The inverse is true and it's easy to get BBR to hog 80% of a link instead of 50% (cohabiting with one other stream). If you're happy for other people to steal bandwidth from you with greedy CCAs then go ahead and ratelimit yourself. I'm not.
It's still useful when dealing with high-latency links with non-zero loss. Fine, something might outcompete it, but without it the throughput would suck anyway.
E.g. if a service runs on a single server (no CDN) and you occasionally get users from Australia, then the site will be a bit laggy for them, but at least it won't be a trickle.
> The change needed to be made on both sides of Wireguard.
Congestion control works from the sender of data to the receiver. You don't need to switch both sides if you are just interested in improving performance in one direction.
Besides that, I agree with what others said about BBRv1. The CUBIC implementation in the Linux kernel works really nicely for most applications.
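Since it is the sender's algorithm that matters for each direction, one way to confirm what a live connection is actually using on a given host is to inspect its sockets; ss prints the congestion control name per connection:

```shell
# -t: TCP sockets, -i: internal TCP info (congestion control name,
# rtt, cwnd, pacing rate, delivery rate, ...)
ss -ti

# Narrow to a single flow, e.g. toward the Wireguard peer
# (the address and port here are placeholders)
ss -ti dst 10.0.0.2:80
```

If only one side was switched, you'd expect to see "bbr" in the output on that host and "cubic" (or whatever the default is) on the other.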