BBRv2+: Towards Balancing Aggressiveness and Fairness with Delay-based Bandwidth Probing

Furong Yang (a,b,c), Qinghua Wu (a), Zhenyu Li (a,∗), Yanmei Liu (d), Giovanni Pau (e,f), Gaogang Xie (g)

(a) Institute of Computing Technology, Chinese Academy of Sciences, China
(b) University of Chinese Academy of Sciences, China
(c) Sorbonne University, France
(d) Alibaba Group, China
(e) University of Bologna, Italy
(f) University of California, Los Angeles, USA
(g) Computer Network Information Center, Chinese Academy of Sciences, China

Abstract

BBRv2, proposed by Google, aims at addressing BBR’s shortcomings of unfairness against loss-based congestion control algorithms (CCAs) and excessive retransmissions in shallow-buffered networks. In this paper, we first comprehensively study BBRv2’s performance under various network conditions and show that BBRv2 mitigates the shortcomings of BBR. Nevertheless, BBRv2’s benefits come with several costs, including slow responsiveness to bandwidth dynamics as well as low resilience to random losses. We then propose BBRv2+ to address BBRv2’s performance issues without sacrificing its advantages over BBR. To this end, BBRv2+ incorporates delay information into its path model, which cautiously guides the aggressiveness of its bandwidth probing so as not to reduce its fairness against loss-based CCAs. BBRv2+ also integrates mechanisms for improved resilience to random losses as well as network jitter. Extensive experiments demonstrate the effectiveness of BBRv2+. In particular, it achieves 25% higher throughput and comparable queuing delay in comparison with BBRv2 in high-mobility network scenarios.

Keywords: Congestion Control, BBR, BBRv2

1. Introduction

Congestion control has been one of the active research topics in computer networks since it was introduced in the 1980s [1]. More than three decades of research on congestion control have brought us a plethora of congestion control algorithms (CCAs) and TCP variants, aiming at efficient utilization of available bandwidth while fairly sharing the bottleneck bandwidth among multiple flows. For instance, the Linux kernel alone has more than 15 different CCAs [2]. While we see many recent proposals on learning-based CCAs (e.g. Remy [3], Aurora [4], PCC-Vivace [5], Indigo [6], Orca [2]), the widely deployed CCAs today are still classic ones (e.g. Cubic [7], BBR [8]).

∗ Corresponding author. Email addresses: [email protected] (Furong Yang), [email protected] (Qinghua Wu), [email protected] (Zhenyu Li), [email protected] (Yanmei Liu), [email protected] (Giovanni Pau), [email protected] (Gaogang Xie)

In this paper, we focus on BBR and its upgrade, BBRv2, as BBR has been used by 22% of the Alexa Top 20K websites [9] and BBRv2 will likely replace BBR in the near future¹. BBR is a rate-based CCA that sets its sending rate based on the measured bottleneck bandwidth (BtlBW) and round-trip propagation time (RTprop). That is, instead of reacting to congestion signals such as losses or delay dynamics, BBR tries to actively operate at Kleinrock’s optimal operating point [11] to maximize throughput without incurring standing queues at a bottleneck link. Previous empirical studies [12–17] have disclosed several shortcomings of BBR: (1) it causes excessive retransmissions in shallow-buffered networks; (2) it is not fair when competing with flows using loss-based CCAs (e.g. Cubic) or with BBR flows that have different Round Trip Times (RTTs); (3) its performance degrades when network jitter is high.

¹ As of March 2021, Google has finished the roll-out of BBRv2 for internal TCP traffic and is tuning its performance to enable roll-out for external traffic [10].

Preprint submitted to Computer Networks, July 8, 2021. arXiv:2107.03057v1 [cs.NI] 7 Jul 2021

To address these issues in BBR, Google proposed BBRv2 [18], which inherits most of the design principles from BBR while reacting to losses and Explicit Congestion Notification (ECN) marks for better coexistence with loss-based CCAs and for less aggressive behavior in shallow-buffered networks. Given that BBRv2 may eventually replace BBR, understanding how BBRv2 actually performs is of great importance for improved performance and fairness. Several studies [19–23] have measured BBRv2 and shown that, in comparison with BBR, BBRv2 improves the inter-protocol fairness against loss-based CCAs and reduces retransmissions in shallow-buffered networks. Nevertheless, we find in this paper that these improvements come with several costs, including low resilience to random packet losses and slow responsiveness to bandwidth dynamics.

In this paper, we first evaluate BBRv2 in various network conditions with an emphasis on the reasons behind the observed performance issues. Our key observations from the empirical study of BBRv2 are as follows:

• Due to its conservative strategies in bandwidth probing and inflight cap estimation, BBRv2 achieves better inter-protocol fairness against loss-based CCAs in shallow-buffered networks than BBR. On the other hand, these strategies also make BBRv2 slightly less competitive than BBR in terms of throughput under moderate buffers. BBRv2 also improves RTT fairness among flows, compared with BBR.

• In shallow-buffered networks, retransmissions of BBRv2 are significantly reduced compared with those of BBR. However, the throughput of BBRv2 is 13% to 16% lower than that of BBR under shallow buffers, as BBRv2 limits its inflight size to about 0.85× BDP for most of the time in these networks.

• BBRv2 is less resilient to random losses than BBR. Interestingly, we find that carefully tuning the loss threshold parameter in BBRv2 according to bottleneck buffer sizes can enhance BBRv2’s loss resilience without sacrificing its advantages in retransmission and fairness.

• BBRv2 is less responsive to bandwidth dynamics than BBR, which leads to low bandwidth utilization and high queuing delay in networks with bandwidth dynamics. The long bandwidth probing interval and the long expiry time of the bottleneck bandwidth estimate are the two major contributors.

• Like BBR, BBRv2’s performance still suffers from congestion window (cwnd) exhaustion in high-jitter networks, which are not rare in wireless scenarios [16, 24–27].

The results of our empirical study of BBRv2 raise one question: are we able to improve BBRv2’s performance while keeping its advantages in retransmission and fairness? If so, how do we achieve this goal?

Compared with BBR, BBRv2’s shortcomings lie in its lower loss resilience and slower responsiveness to bandwidth dynamics. On one hand, the issue regarding loss resilience can be mitigated by carefully tuning the loss threshold parameter in BBRv2. On the other hand, we can increase the aggressiveness of BBRv2 in bandwidth probing to improve its responsiveness to bandwidth dynamics. However, this aggressiveness needs to be cautiously guided, as blindly behaving aggressively may cause unfairness against loss-based CCAs, as BBR does. Currently, the aggressiveness of bandwidth probing in BBR and BBRv2 is either hard-coded or pre-configured according to designers’ experience, without any perception of the network environment. This kind of unguided aggressiveness may make the bandwidth probing either over-aggressive (like BBR) or over-conservative (like BBRv2) in certain environments. Thus, feedback from the network environment needs to be considered in BBRv2’s bandwidth probing strategy to guide the aggressiveness of bandwidth probing.

To address the above gap, we propose BBRv2+. Firstly, BBRv2+ integrates delay information into its path model, which serves as the feedback to guide its aggressiveness in bandwidth probing. Secondly, to utilize the delay information to guide the aggressiveness of BBRv2+’s bandwidth probing, the state-machine of BBRv2 is partially redesigned. In doing so, BBRv2+ balances the aggressiveness in probing for more bandwidth against the fairness towards loss-based CCAs. Thirdly, to avoid being suppressed when co-existing with loss-based CCAs in deep-buffered networks because of using the delay information, BBRv2+ incorporates a dual-mode mechanism: it switches to BBRv2’s state-machine if no RTT sample approaching RTprop is observed for a long period, and returns to the redesigned state-machine once it constantly observes RTT samples approaching RTprop. Finally, as an optimization, BBRv2+ addresses the cwnd exhaustion problem in high-jitter networks by compensating its estimated Bandwidth Delay Product (BDP) according to observed jitter; the compensation mechanism allows the estimated BDP to be close to the actual BDP, and can also be applied to BBR and BBRv2.

Extensive experiments based on both Mininet and Mahimahi with real-world traces show that, compared with BBRv2, BBRv2+ succeeds in balancing the aggressiveness of bandwidth probing and the fairness against loss-based CCAs, improves the resilience to network jitter, and, in particular, achieves 25% higher throughput and comparable queuing delay in high-mobility scenarios where the bandwidth is very dynamic.

To summarize, the contributions of this paper are three-fold: (1) a deep dive into BBRv2 that reveals its pros and cons, compared with BBR; (2) BBRv2+, which addresses the shortcomings of BBRv2 while barely sacrificing BBRv2’s advantages; (3) extensive experiments demonstrating that BBRv2+ meets its design goals. We open-source BBRv2+ to the research community for further testing and improvement [28].

The remainder of this paper is organized as follows. We first give an overview of BBR and BBRv2 in §2. Then, a deep dive into BBRv2, which motivates the design of BBRv2+, is presented in §3. Next, the design and implementation of BBRv2+ are described in §4, and the evaluation of BBRv2+ is shown in §5. After that, we present the related work in §6. Finally, the paper is concluded in §7.

2. Background: an overview of BBR and BBRv2

BBR aims at maximizing throughput while keeping the lowest latency; it requires accurate measurements of both BtlBW and RTprop. Since these two variables cannot be measured simultaneously, BBR introduces a state-machine-based method that alternately estimates BtlBW and RTprop.

Fig. 1: Illustration of BBR life-cycle

Fig. 2: Illustration of BBRv2 life-cycle

As illustrated in Fig. 1, there are four states in the BBR life-cycle. BBR uses pacing_gain to control the sending behavior: to probe for more bandwidth, to drain the queue at the bottleneck link, or to cruise at the speed of BtlBW. Firstly, BBR starts in the Startup state, which exponentially increases the inflight size and sending rate by setting pacing_gain to 2/ln(2). BBR transitions into the Drain state when the BtlBW estimate has plateaued for three RTTs. In the Drain state, BBR reduces pacing_gain to ln(2)/2 to drain the standing queue at the bottleneck link induced during Startup. After the above two stages, BBR has successfully built the path model, with RTprop measured at the beginning of Startup and BtlBW measured at the end of Startup. After that, BBR switches to a steady phase where it alternates between the ProbeBW and ProbeRTT states. During the ProbeBW state, BBR sets pacing_gain to 1.0 to cruise at Kleinrock’s optimal point for 6 cycles, then sets pacing_gain to 1.25 to explore more bandwidth for 1 cycle, and thereafter sets pacing_gain to 0.75 to drain the possible standing queue for 1 cycle. In the ProbeRTT state, BBR reduces its inflight size to 4× MSS (Maximum Segment Size) and waits for max{RTT, 200ms} to measure an updated value of RTprop.
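The gain schedule described above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the text, not BBR’s kernel implementation; the names `pacing_rate` and `probe_bw_rates` are hypothetical, and the phase order simply follows the order in which the text lists the gains.

```python
import math
from itertools import cycle, islice

# Pacing gains quoted in the text.
STARTUP_GAIN = 2 / math.log(2)   # about 2.89
DRAIN_GAIN = math.log(2) / 2     # about 0.35, the reciprocal of the Startup gain

# One ProbeBW cycle in the order the text lists it:
# 6 cruise phases, 1 probing phase, 1 draining phase.
PROBE_BW_GAINS = [1.0] * 6 + [1.25, 0.75]


def pacing_rate(btl_bw_mbps, gain):
    """Sending rate = pacing_gain x estimated bottleneck bandwidth."""
    return gain * btl_bw_mbps


def probe_bw_rates(btl_bw_mbps, phases):
    """Pacing rates (Mbps) for the first `phases` phases of ProbeBW."""
    gains = islice(cycle(PROBE_BW_GAINS), phases)
    return [pacing_rate(btl_bw_mbps, g) for g in gains]


if __name__ == "__main__":
    # With a 40 Mbps estimate: six phases at 40, one at 50, one at 30.
    print(probe_bw_rates(40.0, 8))
```

Note that the eight gains average to exactly 1.0, so over a full ProbeBW cycle the flow sends, on average, at its estimated BtlBW.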

As observed in numerous previous studies [12–16], BBR has two key issues: the unfairness of bandwidth share with loss-based CCAs and the high retransmission rate in shallow-buffered networks. The reasons behind these issues are that BBR is congestion-signal agnostic and is over-aggressive when probing for more bandwidth. To mitigate the problems above, Google proposed BBRv2, which inherits most of BBR’s design (e.g. the core principle, the overall building blocks, etc.) yet redesigns the ProbeBW state, as illustrated in Fig. 2.

BBRv2 adds measurements of packet loss and DCTCP-style ECN marks [29] for estimating the capacity of a bottleneck link. Specifically, it introduces inflight_lo and bw_lo as the short-term lower bounds of inflight size and sending rate respectively, in order to capture the temporary status of the network path (e.g. cross-traffic takes a share of capacity); it uses inflight_hi as the long-term upper bound of inflight size to reduce the likelihood of packet loss. To avoid recklessly probing for more bandwidth, BBRv2 decomposes BBR’s ProbeBW state into four sub-states: ProbeCruise, ProbeRefill, ProbeUp, and ProbeDown.

ProbeCruise: In ProbeCruise, BBRv2 sets pacing_gain to 1. If any loss or ECN mark occurs, BBRv2 updates inflight_lo and bw_lo to max{(1 − β) × inflight_lo, inflight_curr} and max{(1 − β) × bw_lo, BtlBW_curr} respectively, where inflight_curr and BtlBW_curr are the current measurements of inflight size and bandwidth.

ProbeRefill: When BBRv2 has been in ProbeCruise for a period of T (T is determined in ProbeDown), BBRv2 transitions to ProbeRefill by setting inflight_lo and bw_lo to +∞ to refill the “pipe” with BDP-sized inflight data, which lasts for one RTT. The goal of this state is to avoid early losses before the capacity is fully utilized in shallow-buffered networks, since BBRv2 will accelerate in the following ProbeUp state, which may cause the bottleneck buffer to overflow.

ProbeUp: During ProbeUp, BBRv2 sets pacing_gain to 1.25 to probe for more available bandwidth. This state ends either when the current loss rate exceeds a pre-defined explicit loss threshold (2%) (or the ECN mark rate exceeds an ECN threshold), or when the inflight size reaches 1.25× BDP and at least one RTprop has passed. In the former case, BBRv2 sets inflight_hi to the current inflight size.

ProbeDown: During ProbeDown, BBRv2 drains the potential queue at the bottleneck link by setting pacing_gain to 0.75. BBRv2 also sets the duration (T) for the next ProbeCruise state to min{rand(2, 3), (BDP / MSS) × RTT} seconds, where rand(2, 3) means a number between two and three. The intention of T is to match the interval between loss recovery epochs of Reno for TCP fairness. The ProbeDown state ends when BBRv2 cuts its inflight size below the smaller of 1× BDP and 0.85 × inflight_hi. Thereafter, BBRv2 transitions to the next ProbeCruise state.
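The transition rules of the four sub-states can be made concrete with a small Python sketch. It is a simplified reading of the description above, not the BBRv2 kernel code: all function names are hypothetical, the ECN conditions are omitted, β = 0.3 is the default quoted in §3.4, and each lower bound is assumed to be compared with the current measurement of the same kind (bytes with bytes, rate with rate).

```python
import random

BETA = 0.3           # reduction factor for inflight_lo and bw_lo (default per §3.4)
LOSS_THRESH = 0.02   # explicit loss threshold (2%)
HEADROOM = 0.85      # fraction of inflight_hi targeted at the end of ProbeDown


def on_congestion_signal(inflight_lo, bw_lo, inflight_curr, bw_curr, beta=BETA):
    """ProbeCruise: cut both lower bounds by beta, but never below the latest measurements."""
    new_inflight_lo = max((1 - beta) * inflight_lo, inflight_curr)
    new_bw_lo = max((1 - beta) * bw_lo, bw_curr)
    return new_inflight_lo, new_bw_lo


def probe_up_should_exit(loss_rate, inflight, bdp, elapsed_s, rtprop_s,
                         loss_thresh=LOSS_THRESH):
    """ProbeUp ends on excessive loss, or once 1.25x BDP is in flight for at least one RTprop."""
    too_lossy = loss_rate > loss_thresh
    probed_enough = inflight >= 1.25 * bdp and elapsed_s >= rtprop_s
    return too_lossy or probed_enough


def probe_down_done(inflight, bdp, inflight_hi, headroom=HEADROOM):
    """ProbeDown ends once inflight falls below min(1x BDP, 0.85 x inflight_hi)."""
    return inflight < min(bdp, headroom * inflight_hi)


def cruise_duration(bdp_bytes, mss_bytes, rtt_s, rng=random):
    """T = min{rand(2, 3), (BDP / MSS) * RTT} seconds for the next ProbeCruise."""
    reno_epoch_s = (bdp_bytes / mss_bytes) * rtt_s
    return min(rng.uniform(2.0, 3.0), reno_epoch_s)
```

For a 40 Mbps, 40 ms path (BDP of 200,000 bytes) with a 1500-byte MSS, the second term of T is about 5.3 s, so the random 2 to 3 s bound applies; on small-BDP paths the second term is the smaller one and BBRv2 probes more often.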

As the bandwidth probing behaviors of BBRv2 are different from those of BBR, BBRv2 no longer uses a 10-RTT-windowed max filter to track the estimation of BtlBW. Instead, it takes the maximum bandwidth measured in the two most recent ProbeBW stages as the estimation of BtlBW, which ensures that the bandwidth samples from ProbeUp states are considered.
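The difference between the two BtlBW estimators can be illustrated with the following Python sketch. Both classes are hypothetical simplifications: the first assumes exactly one bandwidth sample per round trip, and the second assumes the caller signals the start of each ProbeBW cycle explicitly.

```python
from collections import deque


class WindowedMaxFilter:
    """BBR-style estimate: maximum sample over the last `window` round trips."""

    def __init__(self, window=10):
        self.samples = deque(maxlen=window)  # old samples fall out automatically

    def update(self, bw_sample):
        self.samples.append(bw_sample)
        return max(self.samples)


class TwoCycleMaxFilter:
    """BBRv2-style estimate: maximum sample over the previous and current ProbeBW cycle."""

    def __init__(self):
        self.cycle_max = [0.0, 0.0]  # [previous cycle, current cycle]

    def update(self, bw_sample):
        self.cycle_max[1] = max(self.cycle_max[1], bw_sample)
        return max(self.cycle_max)

    def start_new_cycle(self):
        # The current cycle becomes the previous one; the oldest maximum expires.
        self.cycle_max = [self.cycle_max[1], 0.0]
```

If the available bandwidth halves, the windowed filter forgets the stale maximum after ten round trips, whereas the two-cycle filter keeps it until two cycle boundaries have passed. Because a BBRv2 cycle can last seconds, this is one way to see why BBRv2 takes longer to expire BtlBW estimates.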

Summary of BBRv2: The inflight bound mechanism, driven by losses or ECN marks, and the less aggressive bandwidth probing strategy make BBRv2 more conservative than BBR in shallow-buffered networks, which mitigates the problems regarding excessive retransmissions and unfairness. However, the two changes can potentially reduce BBRv2’s performance under random losses and bandwidth dynamics because, compared with BBR, BBRv2 probes for bandwidth less frequently, takes more time to expire BtlBW estimations, and slows down constantly if the random loss rate exceeds 2%.

3. A Deep Dive into BBRv2

In this section, we conduct extensive measurements to investigate the improvements and overheads of BBRv2, in comparison with BBR. Our key observations include: (1) BBRv2 improves the inter-protocol fairness and RTT fairness, and also reduces retransmissions in shallow-buffered networks; this observation reaffirms those in the previous studies [19–23]; (2) the improvements of BBRv2 come at the cost of low resilience to random loss and slow responsiveness to bandwidth dynamics; that is, BBRv2 fails to achieve a balance between the aggressiveness in probing for more bandwidth and the fairness against loss-based CCAs; (3) like BBR, BBRv2 experiences low throughput in high-jitter networks because of underestimation of the BDP.

3.1. Methodology

We utilize Mininet [30] to build an emulation-based testbed. The testbed, whose topology is shown in Fig. 3, was run on a server with 8 Intel Xeon Platinum cores and 32GB of memory. The operating system is Ubuntu 18.04.5 with BBR and BBRv2 [31] installed. Linux tc-netem [32] is used to emulate different network conditions (e.g. router buffer size, link speed, RTT, random loss rate, jitter). Iperf3 [33] generates TCP traffic between senders and receivers. During the transmission, various performance metrics (e.g. RTT, throughput, retransmissions, inflight bytes) are measured by tcpdump [34] and tcptrace [35]. Moreover, a set of internal variables (e.g. cwnd, pacing rate, RTprop, BtlBW) in BBR and BBRv2 is reported by a Linux kernel module, and the backlog information of the standing queue in bottleneck routers (R2 and R3) is reported by tc [36]. Each set of experiments is repeated five times and the average results are reported.

Fig. 3: Mininet testbed

3.2. Fairness

We first evaluate the fairness of BBRv2. Two types of fairness are investigated: the inter-protocol fairness against loss-based CCAs and RTT fairness. In this set of experiments, two flows start simultaneously, one from H1 to H3 and the other from H2 to H4, and last for three minutes so that throughput converges. The bottleneck bandwidth is fixed at 40Mbps without network jitter or random losses.

We use Jain’s fairness index [37] (F) of the two flows as the fairness metric, calculated according to Eq. 1, where T_i is the average throughput of the i-th flow.

$$F = \frac{(T_1 + T_2)^2}{2\,(T_1^2 + T_2^2)} \qquad (1)$$

F = 1 indicates the maximum fairness, where the two flows have the same average throughput, and F = 0.5 means that one flow’s throughput is zero and the fairness is minimized. As the size of the bottleneck buffer also impacts fairness results, we varied the buffer size to study their relationship.
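As a quick check of Eq. 1 and its two extremes, the index can be computed with a small Python helper. The function name `jain_index` is illustrative; it implements the general n-flow form of Jain’s index, which reduces to Eq. 1 for two flows.

```python
def jain_index(throughputs):
    """Jain's fairness index: (sum T_i)^2 / (n * sum T_i^2)."""
    total = sum(throughputs)
    squares = sum(t * t for t in throughputs)
    if squares == 0:
        raise ValueError("at least one flow must have non-zero throughput")
    return total ** 2 / (len(throughputs) * squares)


if __name__ == "__main__":
    print(jain_index([20.0, 20.0]))  # equal shares of a 40 Mbps link -> 1.0
    print(jain_index([40.0, 0.0]))   # one flow starved -> 0.5
    print(jain_index([30.0, 10.0]))  # 3:1 split -> 0.8
```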

Fig. 4: Inter-protocol fairness of BBR/BBRv2 under different buffer sizes. Panels: (a) BBR vs Cubic; (b) BBRv2 vs Cubic. Axes: buffer size (×BDP, from 0.2 to 32) versus throughput (Mbps), with an upper axis in percent.

3.2.1. Inter-protocol fairness

In the experiment, the flow from H1 to H3 uses either BBR or BBRv2, and that from H2 to H4 uses Cubic, which is the default CCA in Linux and macOS. The RTTs of the paths of the two flows are both set to 40ms.

Fig. 4 shows the inter-protocol fairness results for both BBR and BBRv2. We can observe that, compared with BBR, BBRv2 significantly improves Jain’s fairness index when the buffer is shallow (i.e., less than 2× BDP). This is because BBRv2 reacts to losses caused by buffer overflow and bounds the inflight size by using inflight_hi and inflight_lo. When the bottleneck buffer becomes deeper, Cubic obtains more bandwidth than BBR or BBRv2. This is because the inflight size of both BBR and BBRv2 is limited to about 2× BDP, while Cubic’s inflight size can go beyond this value under deep buffers. As the two flows experience similar RTTs, a larger inflight size means higher throughput. We also note that BBRv2 is less competitive than BBR under moderate buffers. The reason is that BBRv2 is more conservative in bandwidth probing and inflight cap estimation.

3.2.2. RTT fairness

In this experiment, both flows use the same CCA, either BBR or BBRv2. The path between H1 and H3 has an RTT of 40ms and that between H2 and H4 has an RTT of 150ms. Fig. 5 shows the RTT fairness results of BBR and BBRv2, where we set the buffer size to x times the BDP of the path between H2 and H4.

When the buffer is quite shallow (i.e., 0.2× BDP), BBR has good fairness. When the buffer size becomes larger, the BBR flow with longer RTT gradually occupies all the bandwidth and starves the flow with shorter RTT. The reason for the poor RTT fairness of BBR is well documented by the previous studies [14, 17]. The bandwidth probing of BBR leads the aggregated sending rate of the two flows to exceed the bottleneck bandwidth, thus forming a persistent queue at the bottleneck link. As the inflight cap of BBR is proportional to RTprop, the flow with longer RTT pours more data into the bottleneck buffer, thus obtaining a larger share of the bottleneck link’s capacity. This problem is not severe under shallow buffers, as excess packets are mostly dropped instead of forming a persistent queue.

Fig. 5: RTT fairness of BBR/BBRv2 under different buffer sizes. Panels: (a) BBR (flows with 40ms and 150ms RTT); (b) BBRv2 (flows with 40ms and 150ms RTT). Axes: buffer size (×BDP, from 0.2 to 32) versus throughput (Mbps), with an upper axis in percent.

Compared with BBR, BBRv2 has better RTT fairness, especially under deep buffers. The likely reason is three-fold. First, when the buffer size is moderate, losses are triggered due to buffer overflow, and then both flows reduce their inflight size in proportion to their BDP. As the flow with longer RTT has a larger BDP, it reduces its inflight size more than the flow with shorter RTT. Second, when BBRv2 flows are cruising at the speed of BtlBW, they always try to leave headroom² for other flows to explore the bandwidth. Third, BBRv2 enters the ProbeRTT state more often than BBR, thus leading BBRv2 flows to yield occupied capacity more frequently.

3.3. Retransmission and throughput

As one of BBRv2’s design goals is to reduce unnecessary retransmissions in shallow-buffered networks, we next investigate whether BBRv2 achieves this design goal. The experimental setup is similar to that in the previous work [12], where the bottleneck bandwidth varies from 10 to 750 Mbps and the path RTT varies from 5 to 150 ms, as these values are commonly seen in modern networks [12, 14, 38]. The buffer size at the bottleneck link is set to 100KB to emulate a shallow-buffered network because 100KB is less than the BDP of most bandwidth-RTT combinations in our setup. One TCP flow from H1 to H3 runs for 30 seconds and the retransmission rate is recorded for each setup.

² BBRv2 always limits its inflight size below 0.85× inflight_hi to leave headroom for faster throughput convergence with other flows, if there are any.

Fig. 6: The heatmap of the retransmission rate of BBR/BBRv2 under various network conditions. The numbers in squares are retransmission rates in percentage. Panels: (a) BBR; (b) BBRv2. x-axis: RTT (ms): 5, 10, 25, 50, 75, 100, 150; y-axis: BW (Mbps): 10, 20, 50, 100, 250, 500, 750; color scale: Retx Rate (%), 0 to 8.

Fig. 7: The heatmap of Tput Gain (in percentage) in (a) shallow and (b) deep buffered networks. Panels: (a) 100KB buffer; (b) 10MB buffer. Axes as in Fig. 6; color scale: Tput Gain (%), −20 to 20.

The heatmaps in Fig. 6 show the retransmission rates of BBR and BBRv2 under various network conditions. We can observe that the retransmission rate of BBRv2 is significantly reduced compared with that of BBR, especially when the BDP is larger than 400KB (i.e. the buffer size ≤ 0.25× BDP).
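To see which bandwidth-RTT combinations fall into the "buffer ≤ 0.25× BDP" regime, the BDP of each grid cell can be computed directly. The Python sketch below is illustrative only: the grid values are assumed from the axis ticks of Fig. 6, 1KB is assumed to be 1000 bytes, and the function names are hypothetical.

```python
# Grid values assumed from the axis ticks of Fig. 6.
BANDWIDTHS_MBPS = [10, 20, 50, 100, 250, 500, 750]
RTTS_MS = [5, 10, 25, 50, 75, 100, 150]
BUFFER_BYTES = 100_000  # 100KB bottleneck buffer, assuming 1KB = 1000 bytes


def bdp_bytes(bw_mbps, rtt_ms):
    """Bandwidth-delay product in bytes: (bits per second / 8) * seconds."""
    return bw_mbps * 1e6 / 8 * rtt_ms / 1e3


def buffer_in_bdp(bw_mbps, rtt_ms, buffer_bytes=BUFFER_BYTES):
    """Buffer size expressed as a multiple of the path BDP."""
    return buffer_bytes / bdp_bytes(bw_mbps, rtt_ms)


def shallow_cells(max_ratio=0.25):
    """Grid cells whose buffer is at most `max_ratio` x BDP (i.e. BDP >= 400KB for 0.25)."""
    return [(bw, rtt)
            for bw in BANDWIDTHS_MBPS
            for rtt in RTTS_MS
            if buffer_in_bdp(bw, rtt) <= max_ratio]


if __name__ == "__main__":
    print(bdp_bytes(40, 40))  # the 40 Mbps / 40 ms path of §3.2 -> 200000.0 bytes
    for bw, rtt in shallow_cells():
        print(f"{bw} Mbps, {rtt} ms: buffer = {buffer_in_bdp(bw, rtt):.3f} x BDP")
```

Under these assumptions, a cell is in the shallow regime whenever bandwidth (Mbps) × RTT (ms) is at least 3200, for example 100 Mbps at 50 ms.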

The lower retransmission rate of BBRv2 in shallow-buffered networks stems from the fact that BBRv2 reacts to packet losses, while BBR does not. In the Startup and ProbeUp states, BBRv2 tries to send at a rate higher than the bottleneck bandwidth, which leads to excessive losses (i.e. loss rate ≥ 2%). The excessive losses trigger inflight_hi to be set to the current inflight size, which is likely close to the BDP in shallow-buffered networks. As a result, BBRv2’s inflight size is bounded below 0.85× inflight_hi in ProbeCruise because it tries to leave headroom for other flows to explore bandwidth. Since BBRv2 flows spend most of their life-cycle in ProbeCruise, the average throughput of BBRv2 is expected to be 15% lower than the available bandwidth.

In other words, BBRv2 trades off throughput against retransmission in shallow-buffered networks. To verify this, we compute the throughput gain of BBR over BBRv2 (Tput Gain), which is defined in Eq. 2, where Tput_BBR (resp. Tput_BBRv2) is the average throughput of a BBR (resp. BBRv2) flow over 30 seconds.

$$\text{Tput Gain} = \frac{Tput_{BBR} - Tput_{BBRv2}}{Tput_{BBRv2}} \qquad (2)$$

Fig. 7a plots the Tput Gain under various network conditions. We can observe that in the network conditions where BBRv2 reduces the retransmission rate (when the BDP exceeds 400KB), BBRv2 achieves lower throughput than BBR. Specifically, the throughput of BBRv2 is 13% to 16% lower than that of BBR in these cases, which coincides with our analysis.
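Eq. 2 normalizes by BBRv2’s throughput, whereas the text above quotes how much lower BBRv2’s throughput is relative to BBR. The following Python sketch shows how the two figures relate; the function names are illustrative and not taken from the paper.

```python
def tput_gain(tput_bbr, tput_bbrv2):
    """Eq. 2: throughput gain of BBR over BBRv2, relative to BBRv2."""
    return (tput_bbr - tput_bbrv2) / tput_bbrv2


def gain_from_deficit(deficit):
    """Convert 'BBRv2 is `deficit` lower than BBR' (a fraction of BBR's throughput)
    into the Tput Gain of Eq. 2."""
    return deficit / (1 - deficit)


if __name__ == "__main__":
    # A BBRv2 flow held at 85% of what BBR achieves on a 40 Mbps link:
    print(round(tput_gain(40.0, 34.0), 4))     # 0.1765
    print(round(gain_from_deficit(0.15), 4))   # 0.1765
```

A deficit of 15% relative to BBR therefore corresponds to a Tput Gain of about 17.6%, so the values plotted in Fig. 7 are slightly larger in magnitude than the quoted deficits.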

In deep-buffered networks, however, packet losses are much less frequent. It is thus expected that the throughputs of BBR and BBRv2 are comparable. This is confirmed by the results in Fig. 7b, where the buffer size is configured at 10MB, larger than the BDP of most of the bandwidth-RTT combinations in our setup. The throughput differences between BBR and BBRv2 are indeed marginal in these networks.

3.4. Resilience to random losses

Several early tests [20, 22, 23] have shown that BBRv2 is less resilient to random losses than BBR, since BBRv2 limits its inflight size by inflight_lo and inflight_hi, both of which react to all types of losses. In BBRv2, two parameters decide how inflight_lo and inflight_hi react to losses. One is the explicit loss threshold (α) and the other is the inflight_lo reduction factor (β). In our experiments, we investigate BBRv2 variants with different α and β under random loss, where each specific BBRv2 variant is referred to as BBRv2(α, β). For α, we cap it at 20% to match the maximum loss rate that BBR can tolerate; for β, we only evaluate the difference between the case with it (i.e. β = 0.3) and without it (i.e. β = 0)³.

³ The default value 0.3 is necessary for BBRv2 to co-exist with Cubic [31].
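The role of α can be illustrated with a deliberately simple toy model in Python. It is not BBRv2’s actual control loop: it assumes that every bandwidth probe observing a loss rate above α caps inflight_hi at the current inflight size, and that cruising then keeps the inflight size at 0.85× inflight_hi. The class and function names are hypothetical, and β is carried only to label the variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBRv2Variant:
    """A BBRv2(alpha, beta) variant as named in the text."""
    alpha: float = 0.02  # explicit loss threshold
    beta: float = 0.3    # inflight_lo reduction factor (not used by this toy model)

    def loss_is_excessive(self, loss_rate):
        return loss_rate > self.alpha


def inflight_after_probes(variant, random_loss, bdp, probes, headroom=0.85):
    """Toy model: inflight size after `probes` bandwidth probes under a fixed random loss rate."""
    inflight = float(bdp)
    for _ in range(probes):
        if variant.loss_is_excessive(random_loss):
            inflight_hi = inflight             # cap at the current inflight size
            inflight = headroom * inflight_hi  # cruise below the cap
    return inflight


if __name__ == "__main__":
    default = BBRv2Variant()
    tolerant = BBRv2Variant(alpha=0.20)
    # 5% random loss: the default variant shrinks on every probe, the tolerant one does not.
    print(inflight_after_probes(default, 0.05, bdp=100, probes=3))
    print(inflight_after_probes(tolerant, 0.05, bdp=100, probes=3))
```

In this model, random loss above α shrinks the inflight size geometrically even though the path is not congested, which matches the qualitative claim that BBRv2 slows down constantly once the random loss rate exceeds its threshold.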

Fig. 8: Avg. throughput against different random loss rates (buffer size = 32× BDP). Axes: random loss rate (%, log scale) versus throughput (Mbps). Series: Cubic, BBR, BBRv2, BBRv2(5%, 0.3), BBRv2(10%, 0.3), BBRv2(15%, 0.3), BBRv2(20%, 0.3), BBRv2(20%, 0.0).

In the experiment, the bottleneck bandwidth is set to 40Mbps, and the path RTT is 40ms. The buffer size is set to 32× BDP to avoid packet loss due to buffer overflow. The random loss rate ranges from 0% to 30%.

Fig. 8 reports the average throughput of each CCA against random loss rates. We observe that the throughput of BBRv2 drops significantly after the random loss rate reaches 2%. There is a clear sign that α impacts the loss resilience of BBRv2: as α increases, the loss resilience of BBRv2 improves. For instance, with a 10% random loss rate, BBRv2(20%, 0.3) reaches around half of the maximum bandwidth while BBRv2’s throughput nearly drops to zero. The impact of β is also remarkable: the loss resilience of BBRv2(20%, 0.3) is lower than that of BBRv2(20%, 0), which performs similarly to BBR.

The above results indicate that BBRv2’s loss resilience can be improved by raising α. Yet there is a concern: how does α impact the retransmission rate in shallow-buffered networks? As we already saw, BBRv2 alleviates the retransmission issue by setting inflight_hi when the loss rate exceeds α, which lowers its inflight size (see §3.3).

To investigate the aforementioned concern, we further extended the experiments by considering more BBRv2 variants (α ∈ [2%, 100%], β = 0.3) and more buffer-size configurations (buffer size ∈ {0.2, 0.5, 1.0, 1.5, 2.0}× BDP). Fig. 9 plots the retransmission rates of all those BBRv2 variants under a 0% random loss rate (to eliminate the impact of random losses on the retransmission rate), which shows the impact of buffer size. Two observations are notable. Firstly, the retransmission rate increases when α exceeds a certain point, which depends on the bottleneck buffer size. The α values beyond the turning points are too high to be reached by the temporary loss rate, thus limiting the efficacy of inflight_hi. Secondly, if the buffer size is large enough (i.e. 2× BDP in our experiments), the retransmissions are eliminated and the value of α becomes irrelevant.

Fig. 9: Retransmission rates versus loss thresholds (α) under networks with various buffer sizes. The error bars in the figure represent the standard deviations of the retransmission rate. Note that β was fixed at 0.3 for all experiments; the default α in BBRv2 is 2% (the first data point of every line). Axes: BBRv2 loss threshold α (%) versus Retx rate (%). Series: 0.2× BDP, 0.5× BDP, 1.0× BDP, 1.5× BDP, 2.0× BDP.

Fig. 10: Inter-protocol fairness of BBRv2(20%, 0.3). Axes: buffer size (×BDP, from 0.2 to 32) versus throughput (Mbps) of BBRv2 and Cubic, with an upper axis in percent.

Another concern about raising α is the impact on inter-protocol fairness: the larger α is, the more slowly BBRv2 reacts to losses, which makes BBRv2 more aggressive towards loss-based CCAs. To investigate this concern, we test the inter-protocol fairness of BBRv2(20%, 0.3) using the same setup as in §3.2.1, and plot the results in Fig. 10. In comparison with Fig. 4b, we can see that the inter-protocol fairness of BBRv2 is indeed worsened in the case of an extremely shallow buffer (0.2× BDP) due to the increased aggressiveness caused by a larger α. Nevertheless, we also observe that the fairness index is improved under moderate buffers because the increased aggressiveness also makes BBRv2 less vulnerable to Cubic when the bottleneck buffer becomes larger.

Summary of random loss resilience: The loss resilience of BBRv2 can be improved by raising the loss threshold α. Nevertheless, the threshold α should be carefully tuned according to the bottleneck buffer size to avoid increasing retransmissions and being too aggressive towards loss-based CCAs in extremely shallow-buffered networks.

3.5. Responsiveness to bandwidth dynamics

In networks with highly dynamic available bandwidth [27, 39, 40], BBRv2’s bandwidth probing may fail to quickly adapt to bandwidth changes. Next, we investigate BBRv2’s responsiveness to bandwidth changes.

The experiments are designed as follows. The bandwidth of the bottleneck link is configured to increase or decrease by 5Mbps every 2 seconds, the path delay is set to 40ms, and the buffer size is set to 32× BDP. The internal variables during flow transmission (including pacing rate and BtlBW), the instantaneous throughput, and the queue length at the bottleneck link are sampled at an interval of 100ms.

Fig. 11 shows how BBR and BBRv2 adapt to bandwidth increases and decreases, respectively. In this figure, the upward and downward spikes of queue length correspond to the actions of BBR/BBRv2 in probing for more bandwidth or draining the bottleneck buffer. We can observe that BBRv2 is less effective than BBR in terms of responsiveness to bandwidth dynamics, resulting in low utilization of bandwidth and long queuing delay.

As we discussed in §2, to match the interval between Reno loss recovery epochs for better inter-protocol fairness, BBRv2 uses $\min\{\mathrm{rand}(2, 3),\ \frac{\mathrm{BDP}}{\mathrm{MSS}} \times \mathrm{RTT}\}$ seconds as its probing interval. This interval can be tens of RTTs, which is too conservative in such a dynamic environment. In other words, BBRv2 improves the inter-protocol fairness at the cost of poorer responsiveness to bandwidth dynamics.
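To make the size of this interval concrete, the following is a minimal sketch that evaluates the formula above for the testbed settings of this section (40Mbps, 40ms). The function name and the MSS of 1448 bytes are assumptions made for illustration; they are not taken from the BBRv2 source.

```python
import random


def bbrv2_probe_interval_s(btlbw_bps, rtt_s, mss_bytes=1448, rng=random):
    """Probing interval in seconds: min{rand(2, 3), (BDP / MSS) * RTT}.

    mss_bytes=1448 is an illustrative assumption, not a value from the paper.
    """
    bdp_bytes = btlbw_bps / 8.0 * rtt_s
    # A Reno epoch lasts roughly BDP/MSS round trips.
    reno_epoch_s = (bdp_bytes / mss_bytes) * rtt_s
    wall_clock_s = rng.uniform(2.0, 3.0)
    return min(wall_clock_s, reno_epoch_s)


# Testbed settings used in this section: 40 Mbps bottleneck, 40 ms RTT.
interval_s = bbrv2_probe_interval_s(40e6, 0.040)
rounds = interval_s / 0.040
print(f"probing interval: {interval_s:.2f} s (about {rounds:.0f} RTT rounds)")
```

Under these assumptions the Reno-epoch term is about 5.5s, so the random 2 to 3 second term dominates and BBRv2 probes only once every 50 to 75 round trips.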

3.6. Resilience to network jitters

Several works [16, 27] have shown that throughput collapse occurs when BBR operates in high-jitter networks, which are widely deployed, e.g. WiFi and 5G networks operating in the mmWave band [16, 24, 25], and cellular networks [16, 26], especially when high mobility is involved, such as on high-speed rails [27]. It is therefore interesting to investigate whether BBRv2 operates well in networks with high jitter.

In this experiment, the bottleneck bandwidth is 40Mbps and the path RTT is 40ms. The bottleneck buffer size is set to 32× BDP to avoid buffer overflow. To emulate jitters, tc is used to add jitters following a Gaussian distribution at R3’s interface that connects to H3. The mean value of the Gaussian distribution varies from 0 to 120 ms to emulate different degrees of jitter.

Fig. 11: Responsiveness to bandwidth increases (a, c) and decreases (b, d). Panels: (a) BBR (bw. increasing); (b) BBR (bw. decreasing); (c) BBRv2 (bw. increasing); (d) BBRv2 (bw. decreasing). Axes: time (s) versus bandwidth/throughput (Mbps) and queue length (pkt). The red line represents the BtlBW estimation of BBR/BBRv2, and the green line indicates the real bandwidth of the bottleneck. The dark line shows the dynamics of the bottleneck link’s queue length. Note that the spikes of queue length in (a) and (c) are caused by the periodical bandwidth probing of BBR/BBRv2, and the sudden drop of queue length around 10s in (b) and (d) is because BBR/BBRv2 enters the ProbeRTT state.

Fig. 12: Avg. throughput (Mbps) of BBR, BBRv2, and Cubic against different levels of jitter (ms). x = 0 is equivalent to no jitter.

Fig. 12 shows the average value and the standard deviation of the throughput of Cubic, BBR, and BBRv2 under various levels of jitter. Compared with Cubic, both BBR and BBRv2 experience low throughput under high jitter. As documented by Kumar et al. [16], BBR underestimates RTprop in such networks because it uses the minimum RTT of the recent 10s to approximate RTprop, leading to cwnd exhaustion. This problem still exists in BBRv2, even though BBRv2 updates RTprop twice as frequently as BBR (i.e. BBRv2 uses the minimum RTT in the recent 5s to estimate RTprop). We also note that the significant throughput degradation starts when the average jitter reaches the path RTT without jitter (i.e. 40ms).
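The cwnd-exhaustion argument can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the paper: it assumes that the congestion window is capped at about twice the estimated BDP (a cwnd gain of 2), that the RTprop estimate stays at the jitter-free RTT, and that the average RTT grows by the mean jitter. The function name is hypothetical.

```python
def throughput_cap_mbps(btlbw_mbps, rtprop_est_ms, avg_rtt_ms, cwnd_gain=2.0):
    """Upper bound on throughput when cwnd = cwnd_gain * BtlBW * RTprop_est.

    A window-limited sender delivers at most cwnd per average RTT, so an
    underestimated RTprop shrinks the achievable rate once RTTs inflate.
    cwnd_gain=2.0 is an assumption of this sketch.
    """
    window_limited = cwnd_gain * btlbw_mbps * rtprop_est_ms / avg_rtt_ms
    return min(btlbw_mbps, window_limited)


# 40 Mbps bottleneck, 40 ms jitter-free RTT, mean jitter from 0 to 120 ms.
for jitter_ms in (0, 20, 40, 80, 120):
    cap = throughput_cap_mbps(40.0, 40.0, 40.0 + jitter_ms)
    print(f"mean jitter {jitter_ms:3d} ms -> throughput cap {cap:5.1f} Mbps")
```

In this simplified model the cap stays at the bottleneck rate until the mean jitter reaches the jitter-free RTT and falls beyond that point, which is roughly in line with the 40ms turning point noted above.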

3.7. Summary and Implication

We observe that BBRv2 improves the inter-protocol fairness and RTT fairness, and reduces retransmission rates under shallow buffers, at the cost of slow responsiveness to bandwidth dynamics and low resilience to random loss.

First, the root cause for the slow responsiveness is that BBRv2 is over-conservative regarding bandwidth probing. In other words, it fails to achieve a good balance between the aggressiveness in probing for more bandwidth and the fairness against loss-based CCAs. Note, however, that recklessly increasing BBRv2’s aggressiveness in bandwidth probing may lead BBRv2 to generate overwhelming retransmissions and unfairly share bandwidth with loss-based CCAs. In the next section, we propose BBRv2+, which incorporates delay information to cautiously guide the aggressiveness of bandwidth probing to avoid reducing the fairness against loss-based CCAs. The challenge is how to effectively use this signal and how to avoid being suppressed by loss-based CCAs in deep-buffered networks, as other delay-based CCAs are.

Fig. 13: BBRv2+ architecture. The parts that differ from BBRv2 are highlighted in red.

Second, the resilience to random loss can be improved by raising the loss threshold α, where the value of α needs to be set according to the bottleneck buffer size.

Last, the throughput degradation of BBR and BBRv2 in high-jitter networks is owing to the underestimation of RTprop, which in turn leads to a smaller estimation of BDP. We propose a compensation mechanism for BDP that enables the estimated BDP to be close to the real BDP.

4. Design and implementation of BBRv2+

Motivated by our measurement results, we design and implement BBRv2+ in order to address the pitfalls of BBRv2 while maintaining its advantages over BBR (i.e. improved fairness and reduced retransmissions in shallow-buffered networks). The basic idea is to incorporate delay information in BBRv2+’s path model to balance between the aggressiveness in probing for more bandwidth and the fairness against loss-based CCAs (§4.2). In other words, BBRv2+ tries to be more aggressive than BBRv2, where the aggressiveness is guided by the delay information. As the use of the delay information may lead BBRv2+ to perform poorly when it co-exists with loss-based CCAs, a dual-mode mechanism is introduced in BBRv2+, where BBRv2+ switches to BBRv2’s state machine (i.e. invalidating the effect of the delay information) or returns to the redesigned state machine depending on whether loss-based competitors co-exist (§4.3). Moreover, BBRv2+ compensates the estimated BDP when detecting high jitters in order to get an accurate estimation of BDP (§4.4).

4.1. Overview

The architecture of BBRv2+ is shown in Fig. 13. BBRv2+ incorporates delay information in its path model. Specifically, the delay information consists of three state variables of minimum RTTs (the first three variables listed in Table 1), which reflect the change of queuing delay over time. The delay information facilitates quick responsiveness to bandwidth dynamics. In particular, a new sub-state, ProbeTry, is added to the ProbeBW state. In ProbeTry, BBRv2+ slightly speeds up to examine whether this acceleration leads to increased RTTs. In the case of increased RTTs, BBRv2+ quits this probing and moves to the ProbeDown state to drain the queue at the bottleneck link; otherwise, it moves to the ProbeUp state to further explore available bandwidth. BBRv2+ also uses the delay information to quickly adapt to bandwidth decreases: it quickly updates its bottleneck bandwidth estimation to the current bandwidth measurement if an obvious increase of RTT is observed when BBRv2+ is not probing for bandwidth.

Like other CCAs that use delay-based signals, BBRv2+ will be suppressed when co-existing with loss-based CCAs under deep buffers [41], as the loss-based CCAs constantly fill the buffer, leading BBRv2+ to falsely yield up obtained bandwidth. BBRv2+ uses a dual-mode mechanism that forces BBRv2+ to use BBRv2’s state machine when loss-based CCAs co-exist.

Finally, BBRv2+ uses a BDP compensation mechanism to address the cwnd exhaustion problem caused by network jitters. Our key observation is that in high-jitter networks, the BDP will be underestimated because of the underestimation of RTprop. The mechanism compensates BDP by taking the recent RTT variations into consideration; this compensation mitigates the underestimation issue significantly.

4.2. Redesign of the ProbeBW state

In the case of bandwidth increments, BBRv2+ needs to start probing for more bandwidth quickly instead of spending time on cruising with the current estimated BtlBW. Thus, the probing interval needs to be reasonably shortened; it is set to approximately match the probing interval of BBR (8 rounds of RTT). However, if BBRv2+ is already sending at a speed close to the bottleneck bandwidth, the probing interval above may result in more packet losses and thus unfairness against loss-based CCAs in shallow-buffered networks.

Table 1: The new state variables in BBRv2+

Variable | Functionality
MinRTT_prev_rtt | The minimum RTT measured in the previous RTT round.
MinRTT_curr_rtt | The minimum RTT measured in the current RTT round.
MinRTT_before_probe | Saves MinRTT_curr_rtt before entering ProbeUp.
MinRTT_curr_cruise | The minimum RTT measured in the current ProbeCruise state.
Max4RTT(jitter) | The max filter tracking the maximum jitter in the recent four RTT rounds. Here, the jitter is equivalent to the RTT variation maintained by TCP.

Thus, a two-step probing mechanism incorporating the delay information is introduced in BBRv2+, as shown in Fig. 13. A new sub-state ProbeTry, which lasts for two RTTs, is inserted before entering ProbeUp in the state machine. In the first RTT of ProbeTry, BBRv2+ slightly increases its pacing rate by increasing the pacing gain to 1.1. In the second RTT, BBRv2+ reduces the pacing gain to 1.0 and monitors whether MinRTT_curr_rtt is larger than γ × MinRTT_prev_rtt, where MinRTT_curr_rtt is measured on the ACKs for the packets sent in the previous round and thus reflects the queuing delay caused by the previous round (see Table 1). The rationale of using γ > 1 is to introduce a relaxation factor that tolerates noise in RTT measurements: a small γ may lead BBRv2+ to miss some chances to explore bandwidth, while a large γ may make BBRv2+ over-aggressive. In our current implementation, we set γ = 1.02 to tolerate noise of 2% in RTT measurements. It is worth noting that γ is a design parameter and can be tuned by designers⁴. If MinRTT_curr_rtt > γ × MinRTT_prev_rtt, BBRv2+ transits to ProbeDown with a pacing gain of 0.9 to drain the queue accumulated during the first RTT of ProbeTry. Otherwise, it enters ProbeUp to probe for more bandwidth.

⁴ All the design parameters of BBRv2+ in our current implementation are exposed to user space through the /sys/module interfaces, enabling designers to change the parameters without recompiling the kernel module.

To further boost the speed of bandwidth discovery, BBRv2+ also incorporates a continuous probing mechanism based on the delay information. Specifically, at the end of ProbeUp, if MinRTT_curr_rtt ≤ γ × MinRTT_before_probe, BBRv2+ re-enters ProbeUp. The rationale is that there is possibly more free bandwidth capacity, as no significant increase in queuing delay arises in the current ProbeUp sub-state.
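The two delay-guided decisions described in this subsection (leaving ProbeTry, and re-entering ProbeUp) can be sketched as two small transition functions. This is an illustrative sketch rather than the kernel implementation: the function names and the string state labels are hypothetical, and the fallback to ProbeDown at the end of ProbeUp is an assumption based on BBRv2’s usual cycle.

```python
GAMMA = 1.02  # relaxation factor used in the text

# Pacing gains named in the text; the gain of ProbeUp is not stated here.
PACING_GAIN = {"ProbeTry/round1": 1.1, "ProbeTry/round2": 1.0, "ProbeDown": 0.9}


def after_probe_try(min_rtt_curr_rtt, min_rtt_prev_rtt, gamma=GAMMA):
    """Sub-state chosen at the end of the second RTT of ProbeTry."""
    if min_rtt_curr_rtt > gamma * min_rtt_prev_rtt:
        return "ProbeDown"  # the trial built a queue: drain it
    return "ProbeUp"  # no notable delay increase: probe for more bandwidth


def after_probe_up(min_rtt_curr_rtt, min_rtt_before_probe, gamma=GAMMA):
    """Sub-state chosen at the end of ProbeUp (continuous probing)."""
    if min_rtt_curr_rtt <= gamma * min_rtt_before_probe:
        return "ProbeUp"  # still no queue build-up: keep probing
    return "ProbeDown"  # assumption of this sketch: drain, as in BBRv2's cycle


# With a 40 ms baseline, the tolerance band is 40.8 ms.
print(after_probe_try(40.5, 40.0), after_probe_try(41.0, 40.0))
```

With γ = 1.02 and a 40ms baseline, an RTT of 40.5ms is treated as noise, while 41ms is treated as queue build-up.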

In the case of bandwidth decrements, BBRv2+ needs to update its BtlBW estimation to new bandwidth measurements as soon as possible. When bandwidth decreases, BBRv2+ sends data faster than the bottleneck bandwidth and packets accumulate in the buffer of the bottleneck link. We thus also leverage the delay information to detect bandwidth decrements. If BBRv2+ is in ProbeCruise or ProbeDown, Algorithm 1 is called⁵ on the receipt of a new ACK in an RTT round. The algorithm is only applicable in ProbeCruise or ProbeDown in order to eliminate the impact of delay variations caused by the ProbeTry and ProbeUp sub-states. In Algorithm 1, BBRv2+ expires its current BtlBW estimation if MinRTT_curr_rtt is larger than θ times the recently measured minimum RTT. θ is a parameter to balance the speed of convergence to the new bandwidth and the resistance to noise in bandwidth measurements. A small θ may lead to throughput oscillation, while a large θ may reduce BBRv2+’s responsiveness to bandwidth dynamics. We recommend θ ∈ [1.05, 1.15] according to our experience.

Algorithm 1: Advance BtlBW max filter

Input: conn: BBRv2+ TCP connection
    target_rtt ← θ ∗ conn.RTprop
    should_advance ← (conn.MinRTT_curr_rtt > target_rtt)
    if should_advance then
        expire_the_oldest_value(conn.BtlBW)

⁵ It is called at most once for every RTT round to avoid expiring the BtlBW estimation too frequently.
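Algorithm 1 can be sketched in Python as follows. The windowed max filter below is a simplified stand-in for the kernel’s BtlBW filter; the class name, the window length, and the default θ = 1.1 (a value inside the recommended range) are assumptions of this sketch, and the caller is assumed to invoke the function at most once per RTT round.

```python
from collections import deque


class BtlBwMaxFilter:
    """Simplified windowed max filter over per-round bandwidth samples."""

    def __init__(self, window_rounds=2):  # window length is an assumption
        self.samples = deque(maxlen=window_rounds)

    def update(self, bw_sample):
        self.samples.append(bw_sample)

    def expire_oldest(self):
        # Keep at least one sample so that the estimate never becomes empty.
        if len(self.samples) > 1:
            self.samples.popleft()

    def get(self):
        return max(self.samples, default=0.0)


def maybe_advance_btlbw(filt, min_rtt_curr_rtt, rtprop, theta=1.1):
    """Algorithm 1: expire the oldest BtlBW sample when RTT inflates."""
    target_rtt = theta * rtprop
    should_advance = min_rtt_curr_rtt > target_rtt
    if should_advance:
        filt.expire_oldest()
    return should_advance


filt = BtlBwMaxFilter()
filt.update(40.0)  # Mbps, measured before the bandwidth drop
filt.update(20.0)  # Mbps, measured after the bandwidth drop
maybe_advance_btlbw(filt, min_rtt_curr_rtt=50.0, rtprop=40.0)
print(filt.get())
```

In the usage example the stale 40Mbps sample is dropped as soon as the RTT exceeds θ × RTprop, so the estimate follows the newer 20Mbps measurement instead of waiting for the old sample to age out.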

11

Page 12: BBRv2+: Towards Balancing Aggressiveness and Fairness with ...

4.3. Dual-mode mechanism

Due to the use of the delay information to guide the aggressiveness in bandwidth probing, BBRv2+ will be starved by loss-based CCAs under deep buffers, suffering from a problem similar to that of most delay-based CCAs [41]. The root cause is that loss-based CCAs constantly fill the bottleneck buffer, so BBRv2+ falsely treats the increments of RTT as a signal of bandwidth decrements.

BBRv2+ periodically drains the bottleneck buffer, during which the measured minimum RTT (MinRTT_curr_cruise listed in Table 1) is close to RTprop if no loss-based competitor exists. By comparing MinRTT_curr_cruise with the recorded RTprop value, BBRv2+ estimates the existence of loss-based competitors. If loss-based competitors co-exist, BBRv2+ switches to BBRv2’s ProbeBW state, which enables BBRv2+ to co-exist with loss-based CCAs in the same way as BBRv2, which does not yield up obtained bandwidth due to RTT increments. Further, if the loss-based competitors no longer exist, BBRv2+ returns to the redesigned ProbeBW state. We note that the dual-mode mechanism does not switch BBRv2+ to BBRv2’s ProbeBW state if the bottleneck buffer is very shallow, because loss-based CCAs cannot bloat the bottleneck buffer and BBRv2+ will not be starved.

Algorithm 2: The dual-mode mechanism

Input: conn: BBRv2+ TCP connection
 1  if conn.probe_bw_mode = BBRv2+ then
 2      switch_thld ← λ1 ∗ conn.RTprop
 3      if conn.MinRTT_curr_cruise > switch_thld then
 4          conn.buffer_filling++
 5      else
 6          conn.buffer_filling ← 0
 7      if conn.buffer_filling ≥ η1 then
 8          conn.probe_bw_mode ← BBRv2
 9          restart_from_startup(conn)
10  else
11      switch_thld ← λ2 ∗ conn.RTprop
12      if conn.MinRTT_curr_cruise ≤ switch_thld then
13          conn.buffer_empty++
14      else
15          conn.buffer_empty ← 0
16      if conn.buffer_empty ≥ η2 then
17          conn.probe_bw_mode ← BBRv2+

The dual-mode mechanism is detailed in Algorithm 2, which runs at the end of ProbeCruise. If the sender is running in BBRv2+’s ProbeBW state and has not seen RTT samples close to RTprop for a number (η1) of successive ProbeCruise sub-states, it switches to BBRv2’s ProbeBW state and restarts itself from Startup (lines 1–9). We note that restart_from_startup(conn) in line 9 is a heuristic to quickly regain the bandwidth that BBRv2+ has potentially yielded up to loss-based competitors recently. If the sender’s ProbeBW state is BBRv2 and it has seen low RTTs for η2 successive ProbeCruise sub-states, it returns to BBRv2+’s ProbeBW state because the competitors are most likely gone. We note that the four parameters in Algorithm 2, λ1, λ2, η1, and η2, control the sensitivity of BBRv2+ to the co-existence of loss-based CCAs. In practice, we used 1.1, 1.05, 2, and 4 for λ1, λ2, η1, and η2, respectively. Nevertheless, these parameters can be tuned in user space to fit specific networks in our current implementation.
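A direct Python transcription of Algorithm 2, using the parameter values reported above, may help to trace the mode switches. The `Conn` dataclass and its `restarted` flag are hypothetical stand-ins for the kernel connection state and for restart_from_startup(conn); they are not part of the BBRv2+ code.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    """Hypothetical connection state holding the fields used by Algorithm 2."""
    rtprop: float
    min_rtt_curr_cruise: float
    probe_bw_mode: str = "BBRv2+"
    buffer_filling: int = 0
    buffer_empty: int = 0
    restarted: bool = False  # stand-in for restart_from_startup(conn)


def dual_mode_update(conn, lam1=1.1, lam2=1.05, eta1=2, eta2=4):
    """Run once at the end of each ProbeCruise sub-state."""
    if conn.probe_bw_mode == "BBRv2+":
        if conn.min_rtt_curr_cruise > lam1 * conn.rtprop:
            conn.buffer_filling += 1  # queue stayed inflated during cruising
        else:
            conn.buffer_filling = 0
        if conn.buffer_filling >= eta1:
            conn.probe_bw_mode = "BBRv2"
            conn.restarted = True  # regain bandwidth via Startup
    else:
        if conn.min_rtt_curr_cruise <= lam2 * conn.rtprop:
            conn.buffer_empty += 1  # queue drained: competitor likely gone
        else:
            conn.buffer_empty = 0
        if conn.buffer_empty >= eta2:
            conn.probe_bw_mode = "BBRv2+"


conn = Conn(rtprop=40.0, min_rtt_curr_cruise=50.0)
for _ in range(2):
    dual_mode_update(conn)
print(conn.probe_bw_mode)
```

The sketch mirrors the pseudocode as printed, including the fact that the counters are not explicitly reset when the mode switches. In the usage example, two successive cruise periods with a 50ms minimum RTT against a 40ms RTprop are enough to fall back to BBRv2’s ProbeBW state.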

4.4. Compensation for BDP estimation

As we have seen in §3.6, when network jitters are high, BBRv2 (like BBR) underestimates RTprop and thus the BDP of the network path, leading to cwnd exhaustion and performance degradation. To boost BBRv2+’s performance under high network jitters, BBRv2+ takes network jitters into account when estimating the BDP of the network path.

BBRv2+ compensates the BDP estimation with a component proportional to RTT variations when network jitters are high, which is detailed in Algorithm 3. As instantaneous RTT variations could be very dynamic, to ensure that BBRv2+ can tolerate jitters to the maximum extent, we use the recently measured maximum RTT variation, Max4RTT(jitter) in Table 1, as the indicator of recent jitters. When Max4RTT(jitter) exceeds µ × RTprop, the estimated RTprop is increased to the sum of the original RTprop and the delay variation (Max4RTT(jitter)) to mitigate the underestimation of RTprop. We recommend setting µ around 0.5 because the performance of BBRv2 starts to degrade when jitters approach half of RTprop, as observed in §3.6.
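The compensation rule described above (Algorithm 3) is short enough to express directly. The sketch below assumes BtlBW in bytes per second and times in seconds; the function name is illustrative.

```python
def compensated_bdp(btlbw_bytes_per_s, rtprop_s, max4rtt_jitter_s, mu=0.5):
    """BDP estimate with jitter compensation, following Algorithm 3."""
    threshold = mu * rtprop_s
    fixed_rtprop = rtprop_s
    if max4rtt_jitter_s > threshold:
        # High jitter: add the recent maximum RTT variation to RTprop.
        fixed_rtprop += max4rtt_jitter_s
    return btlbw_bytes_per_s * fixed_rtprop


# 40 Mbps (5e6 bytes/s) and a 40 ms RTprop, with low and high jitter.
print(compensated_bdp(5e6, 0.040, 0.010))  # below the threshold: unchanged
print(compensated_bdp(5e6, 0.040, 0.030))  # above the threshold: enlarged
```

With µ = 0.5, a 10ms jitter leaves the 200KB estimate untouched, while a 30ms jitter raises it to roughly 350KB.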

4.5. Implementation

BBRv2+ is implemented as a Linux kernel module (∼2100 LoCs), based on Google’s BBRv2 alpha kernel module [31]. Therefore, it is easy to deploy BBRv2+ on the hosts where BBRv2 is already in use. The parameters of BBRv2+ are exposed to user space through the /sys/module interfaces, which allows users to change the parameters according to their needs without recompiling the kernel module. The code of BBRv2+ is open-sourced on GitHub [28] for the research community to test and improve further.

Algorithm 3: Compensating BDP estimation

Input: conn: BBRv2+ TCP connection
Output: the BDP estimation of BBRv2+
    jitter ← conn.Max4RTT(jitter)
    threshold ← µ ∗ conn.RTprop
    fixed_RTprop ← conn.RTprop
    if jitter > threshold then
        fixed_RTprop ← fixed_RTprop + jitter
    return conn.BtlBW ∗ fixed_RTprop

5. Evaluation of BBRv2+

In this section, we evaluate BBRv2+ based on both Mininet-based emulation and real-world trace-driven emulation. First, we describe our experiment setup in §5.1. We then evaluate the benefits of BBRv2+ from the perspectives of the responsiveness to bandwidth changes (§5.2) and the resilience to network jitters (§5.3). Next, we demonstrate that BBRv2+ is able to keep the advantages of BBRv2 in inter-protocol fairness (§5.4), RTT fairness (§5.5), and low retransmissions (§5.6). Finally, we evaluate the performance of BBRv2+ through real-world trace-driven emulation in §5.7.

5.1. Evaluation setup

Two testbeds are used for the evaluation of BBRv2+. One is the Mininet-based testbed used in §3, as a controlled environment to evaluate BBRv2+ from various perspectives. The other is based on Mahimahi [42], a trace-driven emulator that can accurately replay real-world packet-level traces, as illustrated in Fig. 14. The physical server running the two testbeds is the same as that used to evaluate BBRv2 in §3. The toolset for data analysis and traffic generation is also the same as that in §3.

In all experiments in this section, α (the loss threshold) of BBRv2+ is set to 20%, as this value is suitable for most of the buffer sizes according to the results in §3.4.

Fig. 14: Mahimahi testbed

5.2. Responsiveness to bandwidth dynamics

To evaluate BBRv2+’s responsiveness to bandwidth dynamics, we run BBRv2+ under the same settings as in §3.5 to obtain a microscopic view of how it reacts to bandwidth dynamics. Fig. 15 shows the results.

In Fig. 15a, when there is no bandwidth increment, BBRv2+ only enters ProbeTry for a very short duration and finishes bandwidth probing very soon, which leads to a short-lived standing queue. However, when the bandwidth is increased, BBRv2+ can promptly adapt its BtlBW estimation to the real bandwidth. Compared with the results of BBR in Fig. 11a and BBRv2 in Fig. 11c, BBRv2+ is capable of utilizing newly available bandwidth as quickly as BBR, while its probing strategy (guided by the delay information) incurs lower queuing delay than BBR.

In Fig. 15b, when bandwidth decreases, BBRv2+ notices via the increased RTT that the queuing delay is clearly rising (see §4.2). It expires the old BtlBW estimation and adapts its BtlBW estimation to the available bandwidth. Compared with the results of BBR and BBRv2 in Fig. 11, BBRv2+ adapts its sending rate to the decreased bandwidth much faster, which leads to lower queuing delay.

Next, we compare the responsiveness of Cubic, BBR, BBRv2, and BBRv2+ to bandwidth dynamics in our trace-driven emulation testbed Mahimahi, using five synthesized network traces where the bandwidth changes as step functions, as illustrated in Fig. 16. Following the settings in [2], we set the buffer size to 1.5MB, the delay to 20ms, and the loss rate to zero. In each experiment, the flow throughput, as well as the sojourn time of each packet in the buffer of the bottleneck link (denoted as queuing delay), are recorded.
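For readers who wish to build similar step-function traces, the sketch below generates one in the packet-delivery format used by Mahimahi’s link emulation, where each line is a millisecond timestamp that grants one delivery opportunity for an MTU-sized (1500-byte) packet. The rates and step length in the example are hypothetical and are not the five traces used in the paper; the function name is also illustrative.

```python
def step_trace(rates_mbps, step_ms, mtu_bytes=1500):
    """Delivery timestamps (ms) for a bandwidth that changes as a step function.

    rates_mbps: the bandwidth of each step, in Mbps.
    step_ms: the duration of each step, in milliseconds.
    """
    bits_per_pkt = mtu_bytes * 8
    timestamps = []
    credit_bits = 0.0
    t_ms = 0
    for rate in rates_mbps:
        for _ in range(step_ms):
            t_ms += 1
            credit_bits += rate * 1000.0  # 1 Mbps equals 1000 bits per ms
            while credit_bits >= bits_per_pkt:
                timestamps.append(t_ms)  # one delivery opportunity
                credit_bits -= bits_per_pkt
    return timestamps


# Hypothetical example: 12 Mbps for one second, then 24 Mbps for one second.
trace = step_trace([12, 24], step_ms=1000)
trace_text = "\n".join(str(ts) for ts in trace)
print(len(trace), trace[:3], trace[-3:])
```

At 12Mbps exactly one 1500-byte packet fits into each millisecond, which makes the output easy to check by eye before writing `trace_text` to a trace file.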

To compare the overall performance of all CCAs on a network trace, we normalized the average queuing delay and the average throughput of all CCAs to the minimum average queuing delay and the maximum average throughput achieved on that trace, respectively. In addition, we also normalized the 95th-percentile queuing delay of all CCAs on a network trace to the minimum average queuing delay achieved on that trace. Then, we averaged all normalized values over all traces. The results are shown in Fig. 17. We observe that BBRv2+ achieved significantly higher throughput and lower queuing delay than BBRv2, and lower queuing delay at the cost of slightly lower throughput than BBR. These observations stem from the facts that: (1) BBRv2+ probes for bandwidth at a frequency similar to BBR’s, thus achieving high bandwidth utilization as BBR does; (2) BBRv2+ adapts its sending rate to decreased bandwidth faster than BBR as it quickly updates its BtlBW estimation upon increased queuing delay.

Fig. 15: BBRv2+’s responsiveness to bandwidth increases (a) and decreases (b). Panels: (a) bw. increasing; (b) bw. decreasing. Axes: time (s) versus bandwidth/throughput (Mbps) and queue length (pkt). The red line represents the BtlBW estimation of BBRv2+, and the green line indicates the real bandwidth of the bottleneck. The dark line shows the dynamics of the bottleneck link’s queue length.

Fig. 16: An example of traces with bandwidth changing as a step function.

Fig. 17: Normalized throughput and queuing delay of different CCAs (BBR, BBRv2, BBRv2+, and Cubic). Markers: average throughput and queuing delay; left end of the lines: 95th percentile of queuing delay; ellipses: the standard deviations.
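The normalization procedure described above can be sketched as follows. The dictionary layout, the function names, and the numbers in the example are made up purely for illustration; they are not measurements from the paper.

```python
def normalize_trace(results):
    """Normalize per-CCA results on one trace.

    results: {cca: {"tput": ..., "delay": ..., "delay_p95": ...}}
    Throughput is divided by the best (maximum) average throughput, and both
    delay metrics by the best (minimum) average queuing delay on that trace.
    """
    min_delay = min(r["delay"] for r in results.values())
    max_tput = max(r["tput"] for r in results.values())
    return {
        cca: {
            "tput": r["tput"] / max_tput,
            "delay": r["delay"] / min_delay,
            "delay_p95": r["delay_p95"] / min_delay,
        }
        for cca, r in results.items()
    }


def average_over_traces(normalized_traces):
    """Average the normalized metrics of every CCA over all traces."""
    n = len(normalized_traces)
    metrics = ("tput", "delay", "delay_p95")
    return {
        cca: {m: sum(t[cca][m] for t in normalized_traces) / n for m in metrics}
        for cca in normalized_traces[0]
    }


# Made-up numbers for two placeholder CCAs on a single trace.
trace = {
    "cca_a": {"tput": 10.0, "delay": 20.0, "delay_p95": 40.0},
    "cca_b": {"tput": 20.0, "delay": 10.0, "delay_p95": 30.0},
}
print(average_over_traces([normalize_trace(trace)]))
```

With this convention a normalized throughput of 1.0 and a normalized delay of 1.0 mark the best CCA on a trace for each metric, so the best results sit at high throughput and low delay in a plot such as Fig. 17.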

5.3. Resilience to network jitters

Next, we evaluate the performance of BBRv2+ under network jitters, using the same settings as in §3.6. The throughput of BBRv2+, Cubic, BBR, and BBRv2 under various levels of jitter is shown in Fig. 18a. Unlike BBR and BBRv2, the throughput of BBRv2+ does not degrade when the network jitter becomes larger. Fig. 18b further plots the average inflight bytes of the four CCAs; the results confirm that the BDP compensation mechanism of BBRv2+ succeeds in increasing the inflight size for higher throughput when the network jitter becomes larger. Nevertheless, the throughput of BBRv2+ is slightly lower than that of Cubic. The reason is that our compensation to BDP is a bit conservative; in contrast, Cubic’s cwnd can grow far beyond the real BDP as it is not affected by network jitters.

Fig. 18: Resilience to network jitters. Panels: (a) Throughput (Mbps) versus jitter (ms); (b) Avg. inflight bytes (MB) versus jitter (ms). Legend: BBR, BBRv2, BBRv2+, Cubic.


5.4. Inter-protocol fairness

The inter-protocol fairness of BBRv2+ is evaluated using the same settings as in §3.2.1. We considered BBRv2+ with and without the dual-mode mechanism (see §4.3) to study the impact of this mechanism on inter-protocol fairness. The results are shown in Fig. 19. Several observations are notable.

First, the results demonstrate the efficacy of the dual-mode mechanism. In Fig. 19a, we can observe that BBRv2+ without the dual-mode mechanism is starved by Cubic in deep-buffered cases. This is because BBRv2+ falsely treats the RTT increments caused by Cubic as a signal of bandwidth shrinking, thus constantly yielding up bandwidth to Cubic. The problem is eliminated by the dual-mode mechanism, as shown in Fig. 19b.

Second, compared with the results of BBRv2(20%, 0.3) in Fig. 10, we can see that: (1) BBRv2+ provides better inter-protocol fairness than BBRv2(20%, 0.3) under an extremely shallow buffer (i.e. 0.2× BDP); (2) BBRv2+’s inter-protocol fairness is similar to that of BBRv2(20%, 0.3) under other buffer sizes. The reason for the better inter-protocol fairness of BBRv2+ under an extremely shallow buffer is that, thanks to the two-step probing mechanism (see §4.2), BBRv2+ does not enter ProbeUp, which is more aggressive than ProbeTry, while BBRv2(20%, 0.3) periodically enters ProbeUp.

Third, compared with the results of BBR in Fig. 4a and BBRv2 in Fig. 4b, BBRv2+ performs no worse than the better of BBR and BBRv2 under different buffer sizes. The reasons are three-fold: (1) under shallow buffers, BBRv2+ achieves inter-protocol fairness similar to that of BBRv2 thanks to its cautiously aggressive bandwidth probing strategy (see §4.2); (2) under moderate buffers, BBRv2+ is close to BBRv2(20%, 0.3), which has better inter-protocol fairness than BBRv2 in these cases as explained in §3.4; (3) under deep buffers, the three CCAs perform similarly, as they all have an inflight cap around 2× BDP and are thus unable to outcompete loss-based CCAs.

5.5. RTT fairness

Next, we evaluate the RTT fairness of BBRv2+ using the same setting as in §3.2.2. The results are presented in Fig. 20. Compared with the results of BBR in Fig. 5a and those of BBRv2 in Fig. 5b, BBRv2+ has better RTT fairness than BBR and behaves similarly to BBRv2. The results are expected, as the mechanisms in BBRv2 that improve the RTT fairness over BBR (see §3.2.2) remain unchanged in BBRv2+.

Fig. 19: BBRv2+: Inter-protocol fairness. Panels: (a) without the dual-mode mechanism; (b) with the dual-mode mechanism. Axes: buffer size (BDP) versus throughput (Mbps) of BBRv2+ and Cubic; upper axis in %.

Fig. 20: BBRv2+: RTT fairness. Axes: buffer size (BDP) versus throughput (Mbps) of BBRv2+ flows with 40ms and 150ms RTT; upper axis in %.

5.6. Retransmissions in shallow-buffered networks

In the following, we evaluate whether BBRv2+ is as aggressive as BBR and thus causes excessive retransmissions in shallow-buffered networks, using the same setting as in §3.3. The results are shown in Fig. 21. We observe that when the buffer is extremely shallow (e.g. 0.02× BDP when the bandwidth is 500Mbps and the RTT is 75ms), BBRv2+ incurs more retransmissions than BBRv2. This is because the bandwidth probing frequency of BBRv2+ is higher than that of BBRv2. Although BBRv2+ uses a relatively small pacing gain (1.1) when it starts to probe for more bandwidth in ProbeTry, it still causes buffer overflow when the network buffer is extremely shallow. However, compared with the results of BBR in Fig. 6a, BBRv2+ reduces retransmissions significantly.
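The 0.02× BDP figure quoted above follows from simple arithmetic on the fixed 100KB buffer of Fig. 21. The helper names below are illustrative, and 100KB is taken as 100,000 bytes, which is an assumption about the unit convention.

```python
def bdp_bytes(bandwidth_mbps, rtt_ms):
    """Bandwidth-delay product in bytes."""
    return bandwidth_mbps * 1e6 / 8.0 * rtt_ms / 1e3


def buffer_in_bdp(buffer_bytes, bandwidth_mbps, rtt_ms):
    """Buffer size expressed as a multiple of the BDP."""
    return buffer_bytes / bdp_bytes(bandwidth_mbps, rtt_ms)


# A 100 KB buffer on a 500 Mbps, 75 ms path (one corner of the grid in Fig. 21).
print(bdp_bytes(500, 75))                       # about 4.69 MB
print(round(buffer_in_bdp(100e3, 500, 75), 3))  # about 0.021, i.e. ~0.02x BDP
```

The same helper gives a BDP of 200KB for the 40Mbps, 40ms path used in §3, so the 32× BDP buffer of those experiments corresponds to about 6.4MB.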

5.7. Real-world trace driven emulation

To evaluate how BBRv2+ performs in real network conditions, we compare BBRv2+ with Cubic, BBR, BBRv2, and Orca [2] in the emulation-based Mahimahi testbed, using traces collected in real-world networks. Orca⁶ is used for comparison as a representative of the state-of-the-art learning-based CCAs.

Trace collection: We collected traces from WiFi and LTE networks, using saturatr [39]. In total, 20 network traces are collected, half of which are collected when the collector is stationary relative to the base station (LTE) or the Access Point (WiFi), and the other half are collected when the collector is moving at high speeds (i.e. in vehicles or on high-speed rails). In stationary scenarios, the network bandwidth is usually stable, while in high-mobility scenarios, the bandwidth fluctuates greatly. Examples of stationary and high-mobility traces are shown in Fig. 22. The network delay and loss rate are also measured using ping.

Fig. 21: BBRv2+: retransmission rate (100KB buffer). Axes: RTT (5 to 150 ms) versus bandwidth (10 to 750 Mbps); color scale: retransmission rate (%).

Fig. 22: Examples of traces used in our trace-driven evaluation. Panels: (a) an example of stationary traces; (b) an example of high-mobility traces.

Experimental results: The network buffer size is set to 1.5MB. The collected traces, including bandwidth dynamics, network delay, and loss rate, are the inputs of the emulator. Using the same metrics as in §5.2, the results for the stationary and high-mobility scenarios are shown in Fig. 23a and Fig. 23b, respectively.

In stationary scenarios, BBR, BBRv2, and BBRv2+ perform very similarly to each other because the bandwidth is usually stable. Cubic shows slightly better throughput most of the time at the cost of high queuing delays. Orca has the lowest throughput, probably because the network scenario in which the model was trained differs from our collected traces, which also demonstrates the limitation of learning-based CCAs.

⁶ We directly used the model trained by the authors in our experiments.

In high-mobility scenarios, BBR and BBRv2+ achieve the highest and the second-highest throughput, respectively. Meanwhile, BBRv2 and Cubic fail to achieve consistently high throughput across different high-mobility traces. Compared with BBRv2+, BBR achieves higher throughput at the cost of higher queuing delays, as it is more aggressive. Orca fails to achieve consistently high throughput and low delays in high-mobility scenarios; these results raise concerns about the generalization ability of learning-based CCAs.

The above results of trace-driven emulation using Mahimahi demonstrate that BBRv2+ performs comparably to BBR and BBRv2 in stationary network scenarios, but shows great improvements over BBRv2 in high-mobility scenarios, as it has better responsiveness to bandwidth dynamics.

5.8. Summary of experimental results

We can conclude from the above experiments that BBRv2+ succeeds in balancing the aggressiveness of bandwidth probing and the fairness against loss-based CCAs. With such a balance, which is achieved by neither BBR nor BBRv2, BBRv2+ achieves higher throughput and lower delay than BBRv2 in scenarios where the bandwidth fluctuates, while keeping the advantages of BBRv2 with regard to inter-protocol fairness and reduced retransmissions under shallow buffers. Moreover, the dual-mode mechanism enables BBRv2+ to co-exist with loss-based CCAs under deep buffers, and the compensation mechanism for BDP estimation effectively enhances the performance of BBRv2+ under high network jitters.

6. Related work

BBR evaluation: Since BBR [8] was released by Google in 2016, it has been examined under various network conditions by researchers [12–17]. BBR is unfair when sharing a bottleneck link with Cubic flows or BBR flows with different RTTs [12–14, 17]. In particular, BBR flows are always able to claim at least 35% of the total bandwidth in


Fig. 23: Normalized throughput and queuing delay of different CCAs: (a) stationary traces; (b) high-mobility traces. (markers: average throughput and queuing delay, left end of the lines: 95%-tile of queuing delay, ellipses: the standard deviations of the throughput and queuing delay) [Axes: normalized queuing delay (x), normalized throughput (y, 0.0–1.0); CCAs shown: BBR, BBRv2, BBRv2+, Cubic, Orca.]

deep-buffered networks when competing with Cubic flows [13]. This percentage, however, depends on the link capacity and delay, the bottleneck buffer size, and the number of BBR flows [15]. In shallow-buffered networks, BBR can lead to massive retransmissions [14]. Moreover, the throughput of BBR collapses when network jitters are high, either in experimental emulation [16] or in networks in high-speed train scenarios [27].

BBR enhancement: The issues of BBR identified by empirical studies motivated optimizations of BBR from various perspectives. For instance, [16, 27] proposed several modifications to the RTprop estimation of BBR to counter network jitters, [17, 43, 44] improved BBR’s RTT fairness, [45, 46] improved BBR’s inter-protocol fairness with loss-based CCAs, and [47] reduced BBR’s aggressiveness in shallow-buffered networks to suppress unnecessary retransmissions.

BBRv2 evaluation: Google proposed BBRv2 [18] to solve the problems identified in BBR. In Google’s early tests [18, 20], BBRv2 shows better inter-protocol fairness with loss-based CCAs and reduced retransmissions in shallow-buffered networks. There are also several evaluations of BBRv2 [19, 21–23]. Gomez et al. [19] and Nandagiri et al. [21] studied the inter-protocol fairness and RTT fairness of BBRv2 through emulation. Kfoury et al. [23] evaluated BBRv2 in emulated networks and found that BBRv2 eliminates the massive retransmissions in shallow-buffered networks. Song et al. [22] found that BBRv2 cannot quickly utilize the newly available link capacity when the bottleneck bandwidth increases.

Our work differs from the above studies in two respects: (1) from the measurement perspective, we not only systematically evaluate the performance of BBRv2 under various network conditions, but also analyze the reasons behind the observed performance issues; (2) from the optimization perspective, we propose BBRv2+, which addresses the shortcomings of BBRv2 while barely sacrificing its advantages in fairness and reduced retransmissions.

7. Conclusion

In this paper, we first comprehensively evaluated BBRv2, revealed its pros and cons over BBR, and analyzed the reasons behind BBRv2’s performance issues. Motivated by the results of this evaluation, we proposed BBRv2+, which incorporates delay information in its path model and redesigns the ProbeBW state to achieve a good balance between the aggressiveness of bandwidth probing and the fairness to loss-based CCAs. Extensive experiments demonstrate that BBRv2+ significantly improves the performance over BBRv2, especially in high-mobility scenarios, while barely sacrificing the advantages of BBRv2.

References

[1] V. Jacobson, Congestion avoidance and control, Comput. Commun. Rev. 25 (1995) 157–187.

[2] S. Abbasloo, C.-Y. Yen, H. J. Chao, Classic Meets Modern: a Pragmatic Learning-Based Congestion Control for the Internet, in: Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, ACM, Virtual Event USA, 2020, pp. 632–647. doi:10.1145/3387514.3405892. URL https://dl.acm.org/doi/10.1145/3387514.3405892

[3] K. Winstein, H. Balakrishnan, TCP ex Machina: Computer-Generated Congestion Control 12.

[4] N. Jay, N. H. Rotman, B. Godfrey, M. Schapira, A. Tamar, A deep reinforcement learning perspective on internet congestion control, in: ICML, 2019.

[5] M. Dong, T. Meng, D. Zarchy, E. Arslan, Y. Gilad, B. Godfrey, M. Schapira, PCC Vivace: Online-learning congestion control, in: NSDI, 2018.

[6] F. Y. Yan, J. Ma, G. D. Hill, D. Raghavan, R. S. Wahby, P. Levis, K. Winstein, Pantheon: the training ground for internet congestion-control research, in: USENIX Annual Technical Conference, 2018.

[7] I. Rhee, L. Xu, S. Ha, A. Zimmermann, L. Eggert, R. Scheffenegger, CUBIC for Fast Long-Distance Networks, Tech. Rep. RFC8312, RFC Editor (Feb. 2018). doi:10.17487/RFC8312. URL https://www.rfc-editor.org/info/rfc8312

[8] N. Cardwell, Y. Cheng, C. Gunn, S. Yeganeh, V. Jacobson, BBR: Congestion-based congestion control, Queue 14 (2016) 20–53.

[9] A. Mishra, X. Sun, A. Jain, S. Pande, R. Joshi, B. Leong, The Great Internet TCP Congestion Control Census, Proceedings of the ACM on Measurement and Analysis of Computing Systems 3 (3) (2019) 1–24. doi:10.1145/3366693. URL https://dl.acm.org/doi/10.1145/3366693

[10] S. H. Yeganeh, P. Jha, Y. Seung, L. Hsiao, M. Mathis, V. Jacobson, BBR Updates: Internal Deployment, Code, Draft Plans 8.

[11] L. Kleinrock, Power and deterministic rules of thumb for probabilistic problems in computer communications, 1979.

[12] Y. Cao, A. Jain, K. Sharma, A. Balasubramanian, A. Gandhi, When to use and when not to use BBR: An empirical analysis and evaluation study, in: Proceedings of the Internet Measurement Conference, ACM, Amsterdam Netherlands, 2019, pp. 130–136. doi:10.1145/3355369.3355579. URL http://dl.acm.org/doi/10.1145/3355369.3355579

[13] D. Scholz, B. Jaeger, L. Schwaighofer, D. Raumer, F. Geyer, G. Carle, Towards a Deeper Understanding of TCP BBR Congestion Control, in: 2018 IFIP Networking Conference (IFIP Networking) and Workshops, 2018, pp. 1–9. doi:10.23919/IFIPNetworking.2018.8696830.

[14] M. Hock, R. Bless, M. Zitterbart, Experimental evaluation of BBR congestion control, in: 2017 IEEE 25th International Conference on Network Protocols (ICNP), 2017, pp. 1–10. doi:10.1109/ICNP.2017.8117540.

[15] R. Ware, M. K. Mukerjee, S. Seshan, J. Sherry, Modeling BBR’s Interactions with Loss-Based Congestion Control, in: Proceedings of the Internet Measurement Conference, ACM, Amsterdam Netherlands, 2019, pp. 137–143. doi:10.1145/3355369.3355604. URL http://dl.acm.org/doi/10.1145/3355369.3355604

[16] R. Kumar, A. Koutsaftis, F. Fund, G. Naik, P. Liu, Y. Liu, S. Panwar, TCP BBR for Ultra-Low Latency Networking: Challenges, Analysis, and Solutions, in: 2019 IFIP Networking Conference (IFIP Networking), 2019, pp. 1–9, ISSN: 1861-2288. doi:10.23919/IFIPNetworking.2019.8816856.

[17] S. Ma, J. Jiang, W. Wang, B. Li, Fairness of Congestion-Based Congestion Control: Experimental Evaluation and Analysis, arXiv:1706.09115 [cs]. URL http://arxiv.org/abs/1706.09115

[18] N. Cardwell, Y. Cheng, S. H. Yeganeh, I. Swett, V. Vasiliev, P. Jha, Y. Seung, M. Mathis, V. Jacobson, BBR v2: A Model-based Congestion Control 36.

[19] J. Gomez, E. Kfoury, J. Crichigno, E. Bou-Harb, G. Srivastava, A Performance Evaluation of TCP BBRv2 Alpha, in: 2020 43rd International Conference on Telecommunications and Signal Processing (TSP), IEEE, Milan, Italy, 2020, pp. 309–312. doi:10.1109/TSP49548.2020.9163512. URL https://ieeexplore.ieee.org/document/9163512/

[20] N. Cardwell, Y. Cheng, S. H. Yeganeh, P. Jha, Y. Seung, I. Swett, V. Vasiliev, B. Wu, M. Mathis, V. Jacobson, BBR v2: A Model-based Congestion Control IETF105 Update 21.

[21] A. Nandagiri, M. P. Tahiliani, V. Misra, K. K. Ramakrishnan, BBRv1 vs BBRv2: Examining Performance Differences through Experimental Evaluation, in: 2020 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), IEEE, Orlando, FL, USA, 2020, pp. 1–6. doi:10.1109/LANMAN49260.2020.9153268. URL https://ieeexplore.ieee.org/document/9153268/

[22] Y.-J. Song, G.-H. Kim, I. Mahmud, W.-K. Seo, Y.-Z. Cho, Understanding of BBRv2: Evaluation and Comparison with BBRv1 Congestion Control Algorithm, IEEE Access (2021) 1–1. doi:10.1109/ACCESS.2021.3061696. URL https://ieeexplore.ieee.org/document/9361674/

[23] E. F. Kfoury, J. Gomez, J. Crichigno, E. Bou-Harb, An emulation-based evaluation of TCP BBRv2 Alpha for wired broadband, Computer Communications 161 (2020) 212–224. doi:10.1016/j.comcom.2020.07.018. URL https://linkinghub.elsevier.com/retrieve/pii/S014036642030092X

[24] M. Zhang, M. Polese, M. Mezzavilla, J. Zhu, S. Rangan, S. Panwar, M. Zorzi, Will TCP work in mmWave 5G Cellular Networks?, arXiv:1806.05783 [cs]. URL http://arxiv.org/abs/1806.05783

[25] D. Chitimalla, K. Kondepu, L. Valcarenghi, M. Tornatore, B. Mukherjee, 5G fronthaul-latency and jitter studies of CPRI over ethernet, IEEE/OSA Journal of Optical Communications and Networking 9 (2) (2017) 172–182. doi:10.1364/JOCN.9.000172.

[26] J. D. Beshay, A. T. Nasrabadi, R. Prakash, A. Francini, Link-Coupled TCP for 5G networks, in: 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), 2017, pp. 1–6. doi:10.1109/IWQoS.2017.7969170.

[27] J. Wang, Y. Zheng, Y. Ni, C. Xu, F. Qian, W. Li, W. Jiang, Y. Cheng, Z. Cheng, Y. Li, X. Xie, Y. Sun, Z. Wang, An Active-Passive Measurement Study of TCP Performance over LTE on High-speed Rails, in: The 25th Annual International Conference on Mobile Computing and Networking, ACM, Los Cabos Mexico, 2019, pp. 1–16. doi:10.1145/3300061.3300123. URL https://dl.acm.org/doi/10.1145/3300061.3300123

[28] Y. Furong, yangfurong/BBRv2plus (Jul. 2021). URL https://github.com/yangfurong/BBRv2plus

[29] M. Alizadeh, A. Greenberg, D. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, M. Sridharan, Data center TCP (DCTCP), in: SIGCOMM ’10, 2010.

[30] Mininet: An Instant Virtual Network on Your Laptop (or Other PC) - Mininet. URL http://mininet.org/

[31] google/bbr. URL https://github.com/google/bbr/tree/v2alpha

[32] tc-netem(8) - Linux manual page. URL https://www.man7.org/linux/man-pages/man8/tc-netem.8.html

[33] esnet/iperf. URL https://github.com/esnet/iperf

[34] TCPDUMP/LIBPCAP public repository. URL https://www.tcpdump.org/

[35] tcptrace(1): TCP connection analysis tool - Linux man page. URL https://linux.die.net/man/1/tcptrace

[36] tc(8) - Linux manual page. URL https://man7.org/linux/man-pages/man8/tc.8.html

[37] S. Floyd, Metrics for the Evaluation of Congestion Control Mechanisms, RFC 5166 (Mar. 2008). doi:10.17487/RFC5166. URL https://rfc-editor.org/rfc/rfc5166.txt

[38] B. Huffaker, M. Fomenkov, D. Plummer, D. Moore, K. Claffy, Distance metrics in the internet, 2002.

[39] K. Winstein, A. Sivaraman, H. Balakrishnan, Stochastic Forecasts Achieve High Throughput and Low Delay over Cellular Networks 13.

[40] L. Li, K. Xu, T. Li, K. Zheng, C. Peng, D. Wang, X. Wang, M. Shen, R. Mijumbi, A measurement study on multi-path TCP with multiple cellular carriers on high speed rails, in: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication - SIGCOMM ’18, ACM Press, Budapest, Hungary, 2018, pp. 161–175. doi:10.1145/3230543.3230556. URL http://dl.acm.org/citation.cfm?doid=3230543.3230556

[41] R. Al-Saadi, G. Armitage, J. But, P. Branch, A Survey of Delay-Based and Hybrid TCP Congestion Control Algorithms, IEEE Communications Surveys Tutorials 21 (4) (2019) 3609–3638. doi:10.1109/COMST.2019.2904994.

[42] R. Netravali, A. Sivaraman, S. Das, A. Goyal, K. Winstein, J. Mickens, H. Balakrishnan, Mahimahi: Accurate Record-and-Replay for HTTP, in: ATC, 2015, p. 14.

[43] M. Yang, P. Yang, C. Wen, Q. Liu, J. Luo, L. Yu, Adaptive-BBR: Fine-Grained Congestion Control with Improved Fairness and Low Latency, in: 2019 IEEE Wireless Communications and Networking Conference (WCNC), 2019, pp. 1–6, ISSN: 1558-2612. doi:10.1109/WCNC.2019.8885527.

[44] G.-H. Kim, Y.-Z. Cho, Delay-Aware BBR Congestion Control Algorithm for RTT Fairness Improvement, IEEE Access 8 (2020) 4099–4109. doi:10.1109/ACCESS.2019.2962213. URL https://ieeexplore.ieee.org/document/8943219/

[45] Y. Zhang, L. Cui, F. P. Tso, Modest BBR: Enabling Better Fairness for BBR Congestion Control, in: 2018 IEEE Symposium on Computers and Communications (ISCC), 2018, pp. 00646–00651, ISSN: 1530-1346. doi:10.1109/ISCC.2018.8538521.

[46] Y.-J. Song, G.-H. Kim, Y.-Z. Cho, BBR-CWS: Improving the Inter-Protocol Fairness of BBR, Electronics 9 (5) (2020) 862. doi:10.3390/electronics9050862. URL https://www.mdpi.com/2079-9292/9/5/862

[47] I. Mahmud, G.-H. Kim, T. Lubna, Y.-Z. Cho, BBR-ACD: BBR with Advanced Congestion Detection, Electronics 9 (1) (2020) 136. doi:10.3390/electronics9010136. URL https://www.mdpi.com/2079-9292/9/1/136
