SCTP PERFORMANCE IMPROVEMENT BASED ON ADAPTIVE
RETRANSMISSION TIME-OUT ADJUSTMENT
THESIS
Presented to the Graduate Council of Texas State University-San Marcos
in Partial Fulfillment of the Requirements
for the Degree
Master of SCIENCE
by
Sagun Khatri, B.A.
San Marcos, Texas
August 2011
SCTP PERFORMANCE IMPROVEMENT BASED ON ADAPTIVE
RETRANSMISSION TIME-OUT ADJUSTMENT
Committee Members Approved:
Wuxu Peng, Chair
Stan McClellan
Hongchi Shi
Approved:
J. Michael Willoughby
Dean of the Graduate College
FAIR USE AND AUTHOR’S PERMISSION STATEMENT
Fair Use
This work is protected by the Copyright Laws of the United States (Public Law 94-553, section 107). Consistent with fair use as defined in the Copyright Laws, brief quotations from this material are allowed with proper acknowledgement. Use of this material for financial gain without the author’s express written permission is not allowed.
Duplication Permission
As the copyright holder of this work I, Sagun Khatri, authorize duplication of this work, in whole or in part, for educational or scholarly purposes only.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1. INTRODUCTION
   1.1 Background
   1.2 Motivation
   1.3 Objectives
   1.4 Layout of the Thesis
2. BACKGROUND
   2.1 Commonalities between SCTP and TCP
   2.2 Differences between SCTP and TCP
   2.3 Retransmission
   2.4 Jacobson’s Algorithm
   2.5 Performance Deterioration of Jacobson’s Algorithm
   2.6 Karn’s Algorithm
   2.7 Fast Retransmission Timeout
   2.8 Related Work
   2.9 Summary
3. ADAPTIVE RTO MIN (ARM) ALGORITHM
   3.1 Introduction
   3.2 Research Logistics
   3.3 Adaptive RTO MIN (ARM) Algorithm
   3.4 Data Gathering for Multiple Payloads
       3.4.1 Performance Evaluation for 50 bytes Payload
       3.4.2 Performance Evaluation for 500 bytes Payload
       3.4.3 Performance Evaluation for 1000 bytes Payload
       3.4.4 Performance Evaluation for 2000 bytes Payload
   3.5 Algorithm Comparison Chart
   3.6 Summary
4. CONCLUSIONS AND FUTURE WORK
BIBLIOGRAPHY
LIST OF TABLES
Table 3.1 Data in this table are the outcome of the Adaptive RTO algorithm implementation.
Table 3.2 Data gathered using the static RTO MIN (SRM) and Adaptive RTO MIN (ARM) algorithms executed on multiple payload and file sizes.
Table 3.3 This chart shows us that when a chunk’s payload size is relatively small the Adaptive RTO algorithm performs better than the static RTO MIN.
LIST OF FIGURES
Figure 1.1 In this diagram the X-axis represents the number of RTO updates and the Y-axis represents time in milliseconds.
Figure 1.2 In this figure the X-axis represents the number of RTO updates, and the Y-axis represents time in milliseconds.
Figure 2.1 Jacobson’s algorithm
Figure 2.2 In this figure we see how Jacobson’s algorithm currently behaves in SCTP.
Figure 2.3 This is a zoomed-in version of Figure 2.2.
Figure 2.4 Karn’s Algorithm - the Retransmission Ambiguity Problem
Figure 3.1 The SCTP Echo Server running at the Texas State University–San Marcos Computer Science Department.
Figure 3.2 The Adaptive RTO algorithm divides the possible RTT values into five sectors.
Figure 3.3 Adaptive RTO algorithm.
Figure 3.4 This graph is generated from the data in Table 3.1.
Figure 3.5 Without the Adaptive RTO MIN algorithm, the static RTO MIN holds the RTO from falling below 1,000 milliseconds.
CHAPTER 1
INTRODUCTION
1.1 Background
The Stream Control Transmission Protocol (SCTP) is a new IP transport protocol,
existing at an equivalent level with User Datagram Protocol (UDP) and Transmis-
sion Control Protocol (TCP), which provides transport layer functions to many In-
ternet applications. SCTP has been approved by the IETF as a Proposed Standard
[Ong and Yoakum, 2002].
Like TCP, SCTP provides a reliable transport service, ensuring that data is
transported across the network without error and in sequence. Like TCP, SCTP
is a session-oriented mechanism, meaning that a relationship is created between the
endpoints of an SCTP association prior to data being transmitted, and this rela-
tionship is maintained until all data transmission has been successfully completed
[Ong and Yoakum, 2002]. The word “association” is used in SCTP instead of “con-
nection” to avoid the connotation that a connection involves communication be-
tween only two IP addresses. An association refers to a communication between
two systems, which may involve more than two IP addresses due to multihoming
[Stevens et al., 2004].
Unlike TCP, SCTP provides a number of functions that are critical for telephony
signaling transport, and at the same time can potentially benefit other applications
needing transport with additional performance and reliability [Ong and Yoakum, 2002].
1.2 Motivation
In recent years network infrastructure has changed significantly. In the 1980s the
Round Trip Time (RTT) for a packet crossing the continental US, for example between
New York City and Los Angeles, California, was around 200 milliseconds. Currently
the same trip can easily take less than 60 milliseconds. This decrease shows that
there has been a major improvement in the network infrastructure. With the help
of new technologies, such as fiber cables and better satellite communication, faster
data transfer between two endpoints is going to be an ongoing trend.
Even though the network infrastructure has improved significantly over the years,
some of the network algorithms have not been fine tuned to take advantage of the
infrastructure improvement. Some of the network algorithms designed in the late
1980s are still being used; Jacobson’s algorithm is a good example of this trend.
Jacobson’s algorithm recalculates the Retransmission Time-Out (RTO) each time a
Round Trip Time (RTT) is measured [Stevens et al., 2004]. The algorithm was designed
for the network infrastructure of the late 1980s, when bandwidth was not as abundant.
1.3 Objectives
This work focuses on improving the file transfer time for the Stream Control
Transmission Protocol (SCTP). SCTP borrows many features from TCP, but there are some
areas where SCTP needs fine tuning in order to take advantage of its unique features,
such as multi-homing. One particular area where SCTP needs improvement
is the implementation of the Retransmission Time-Out Minimum (RTO MIN).
Figure 1.1: In this diagram the X-axis represents the number of RTO updates and the Y-axis
represents time in milliseconds. (Plotted series: RTO, RTT, and RTO MIN, with regions annotated as waste.)
The Retransmission Time-Out Minimum (RTO MIN) constant in SCTP is set
very high (i.e., 1000 milliseconds), as seen in Figure 1.1 and Figure 1.2. Jacobson’s
algorithm does not respond to sporadic changes in the RTT value, mainly because of
the RTO MIN, as seen in Figure 1.2. The result is a waste of valuable system resources
that could have been used for transmitting a few more packets of data.
The aim of this thesis is not only to improve upon the time taken
to transfer a file from one node to the other, but also to make sure that the new
algorithm does not have any side effects, like bandwidth congestion. To accomplish
this we propose to fine tune the RTO MIN to make SCTP aware of a packet loss in
a significantly shorter amount of time. Thus, the retransmission will be expedited.
Figure 1.2: In this figure the X-axis represents the number of RTO updates, and the Y-axis
represents time in milliseconds. (Plotted series: RTO, RTT, and RTO MIN, with a region annotated as waste.)
With static RTO MIN (SRM), Jacobson’s algorithm is idle the majority of the time,
as seen in Figure 1.2. In this thesis we propose a new algorithm called Adaptive RTO
MIN (ARM) algorithm, which dynamically lowers the lower bound of the RTO, thus
forcing Jacobson’s algorithm to engage in the RTO’s calculation.
The retransmission timer is a key feature of a reliable link or transport layer
protocol. It can greatly influence peer-to-peer performance. A retransmission timer
that is too optimistic often expires prematurely. Such an event is called a spurious
timeout [Ludwig and Sklower, 2000]. The ARM algorithm should therefore avoid spurious
timeouts: fine tuning the RTO MIN should not make the RTO too optimistic.
Otherwise, there is a risk of high bandwidth consumption because of spurious
retransmissions. Detecting packet loss faster should result in a faster file transfer
compared to the existing algorithm.
1.4 Layout of the Thesis
Chapter 1: Introduction to SCTP. Describes the fundamental differences between
SCTP, TCP, and UDP.
Chapter 2: Provides detailed background information on SCTP and on research that
is related to this thesis.
Chapter 3: Detailed description of the Adaptive RTO MIN algorithm.
Chapter 4: Conclusion and Future Work.
CHAPTER 2
BACKGROUND
2.1 Commonalities between SCTP and TCP
SCTP shares many features of TCP. For example:
• Both TCP and SCTP are developed to achieve the highest possible throughput
in various network scenarios. Thus, they will try to make use of all available
bandwidth in the network to transmit data as fast as possible to remote users.
This is known as a thick stream. An example of a thick stream would be a file transfer
from one node to the other. When very few packets are sent, without the need
to make use of the available bandwidth, and those packets are small compared
to the available payload, this is known as a thin stream [Pedersen, 2006].
• SCTP and TCP adjust the sending rate to avoid overwhelming both the receiver
and the network. Limiting the sending rate to avoid overwhelming a
receiver is called flow control. Limiting the sending rate to avoid overwhelming
the network is called congestion control [Matthews, 2005].
• SCTP and TCP maintain another limit called the congestion window. The congestion
window typically starts at 1 Maximum Segment Size (MSS) and then
increases with each segment that is successfully acknowledged [Matthews, 2005].
• In SCTP and TCP the congestion window grows multiplicatively with each
acknowledgement. This phase of multiplicative growth is called slow start. However,
this multiplicative growth does not continue forever; there is an adaptively
determined threshold [Matthews, 2005].
• In SCTP data can be transmitted in one or more streams within a single associ-
ation and subjected to common congestion and flow control. These mechanisms
are based on TCP. This means that SCTP is using slow start and congestion
avoidance in its procedures. During slow-start the initial congestion window is
set to 2× Maximum Transmission Unit (MTU). During congestion avoidance,
the congestion window is increased by 1× MTU per RTT.
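As a minimal sketch of the growth behavior described in the list above, the following C fragment models the congestion window once per RTT. It is a simplification, not the kernel's implementation: the function names (initial_cwnd, cwnd_after_rtt), the per-RTT doubling used to approximate slow start, and the choice to cap slow-start growth at the threshold are our assumptions for illustration.

```c
/* Simplified per-RTT model of the congestion window (all sizes in bytes).
 * Names and the capping rule are illustrative assumptions. */

/* Initial congestion window: 2 x MTU, as stated in the text. */
unsigned int initial_cwnd(unsigned int mtu)
{
    return 2 * mtu;
}

/* Advance the window by one RTT. Below the threshold the window doubles
 * (slow start); at or above it the window grows by one MTU per RTT
 * (congestion avoidance). */
unsigned int cwnd_after_rtt(unsigned int cwnd, unsigned int ssthresh,
                            unsigned int mtu)
{
    if (cwnd < ssthresh) {
        unsigned int doubled = cwnd * 2;
        return doubled > ssthresh ? ssthresh : doubled;
    }
    return cwnd + mtu;
}
```

With an MTU of 1500 bytes and a threshold of 12000 bytes, this model steps through 3000, 6000, and 12000 bytes during slow start and then adds 1500 bytes per RTT.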
2.2 Differences between SCTP and TCP
SCTP differs from TCP in fundamental ways, which is why there is a need to optimize
algorithms to better suit SCTP, rather than just copy and paste the code from TCP.
Following are some key differences between the two protocols:
• Unlike TCP, SCTP is message-oriented. It provides sequenced delivery of individual
records. As with the User Datagram Protocol (UDP), the length of a record written
by the sender is passed to the receiving application [Stevens et al., 2004].
• SCTP can provide multiple streams between connection endpoints, each with
its own reliable sequenced delivery of messages. A lost message in one of these
streams does not block delivery of messages in any other streams. This approach
is in contrast to TCP, where a loss at any point in the single stream of bytes
blocks delivery of all future data on the connection until the loss is repaired
[Stevens et al., 2004].
• SCTP also provides a multihoming feature, which allows a single SCTP end-
point to support multiple IP addresses. This feature can provide increased
robustness against network failure. An endpoint can have multiple redundant
network connections, where each of these networks has a different connection
to the Internet. SCTP can work around a failure of one network or path across
the Internet by switching to another address already associated with the SCTP
association. The word “association” is used in SCTP instead of “connection” to
avoid the connotation that a connection involves communication between only
two IP addresses [Stevens et al., 2004].
• In SCTP a message from the application layer is transmitted in a data chunk
which has its own unique Transmission Sequence Number (TSN). Several chunks
of different types may get bundled into one packet as long as the total size of a
packet does not exceed the MTU of the network path. If a message does not fit
into a single packet according to the MTU, it is fragmented into multiple data
chunks where each fits into a packet [Stewart et al., 2000].
• SCTP uses SACK to acknowledge the receipt of data chunks. In the absence of
loss, a SACK is sent back for every second packet received or within 200 milliseconds
of the arrival of any unacknowledged data chunks.
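The SACK rule in the last bullet can be written as a small predicate. This is only a sketch under our own assumptions: the function name should_send_sack and its two counters are hypothetical, and a real implementation tracks more state than this.

```c
#include <stdbool.h>

#define SACK_EVERY_N_PACKETS 2   /* a SACK for every second packet received */
#define SACK_DELAY_MS        200 /* or within 200 ms of unacknowledged data */

/* Decide whether the receiver should send a SACK now.
 * packets_since_last_sack: packets received since the previous SACK.
 * ms_since_first_unacked:  time since the oldest unacknowledged chunk arrived. */
bool should_send_sack(unsigned int packets_since_last_sack,
                      unsigned int ms_since_first_unacked)
{
    if (packets_since_last_sack == 0)
        return false;  /* nothing to acknowledge */
    return packets_since_last_sack >= SACK_EVERY_N_PACKETS
        || ms_since_first_unacked >= SACK_DELAY_MS;
}
```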
2.3 Retransmission
When an SCTP sender transmits a chunk, it also sets a timer called a retransmission
timer. When an acknowledgment arrives, the timer is cancelled. If the timer expires
before an acknowledgment arrives, the chunk will be retransmitted.
SCTP does not always wait for a retransmission timer to expire before retrans-
mitting data. SCTP will also interpret a series of duplicate acknowledgments as an
early sign of packet losses [Matthews, 2005]. Fast retransmission occurs when four
Selective Acknowledgements (SACKs) are received; this is discussed in detail in
Section 2.7.
The RTT between a client and a server can change rapidly with time as net-
work conditions change. For optimal performance a timeout and retransmission
algorithm should be used that takes into account the actual RTT’s characteristics
along with changes in the RTT over time. Much work has been focused on this
area, mostly relating to TCP, but the same ideas apply to any network application
[Allman and Paxson, 1999, Coene and Pastor-Balbas, 2006].
The retransmission time-out (RTO) value is the time that elapses after a packet
was sent until the sender considers it lost and retransmits it; this event is called a
timeout. The RTO is a prediction of the upper limit of the round trip time (RTT), i.e.,
the time that elapses after a packet left the sender until the sender receives a positive
acknowledgment (ACK) for that packet. The time that remains until the timeout for
a packet occurs is maintained by the retransmission timer state (REXMT). Thus, the
RTO is the REXMT’s initial value [Ludwig and Sklower, 2000].
    delta  = measuredRTT − srtt
    srtt   ← srtt + g × delta
    rttvar ← rttvar + h × (|delta| − rttvar)
    RTO    = srtt + 4 × rttvar
Figure 2.1: Jacobson’s algorithm
The retransmission timer is the key feature of a reliable link or transport layer
protocol. It can greatly influence peer-to-peer performance. A retransmission timer
that is too optimistic often expires prematurely. Such an event is called a spurious
timeout. It causes unnecessary traffic, so-called spurious retransmissions, that reduces
a connection’s effective throughput. On the other hand, a retransmission timer that is too
conservative may cause long idle time before the lost packet is retransmitted. This
can also degrade performance [Ludwig and Sklower, 2000].
2.4 Jacobson’s Algorithm
We want to calculate the RTO to use for every packet that we send. To calculate this,
we measure the RTT: the actual round-trip time for a packet. Every time we measure
an RTT, we update two statistical estimators: srtt is the smoothed RTT estimator and
rttvar is the smoothed mean deviation estimator. The latter is a good approximation
of the standard deviation, but easier to compute since it does not involve a square
root. Given these two estimators, the RTO is assigned as srtt plus four times rttvar.
[Jacobson and Karels, 1988] provides all the details of these calculations, which are
summarized in Figure 2.1.
In Figure 2.1 delta is the difference between the measured RTT and the current
smoothed RTT estimator (srtt), g is the gain applied to the RTT estimator and equals
1/8, and h is the gain applied to the mean deviation estimator and equals 1/4.
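The update in Figure 2.1 can be sketched in C as follows. The struct and function names (rtt_estimator, jacobson_update) are ours, floating point is used only for readability, and how the estimators are initialized before the first measurement is left to the caller, since it is not covered here.

```c
/* Estimator state from Figure 2.1; all values in milliseconds. */
struct rtt_estimator {
    double srtt;    /* smoothed RTT estimator */
    double rttvar;  /* smoothed mean deviation estimator */
};

/* Apply one RTT measurement and return the resulting RTO. */
double jacobson_update(struct rtt_estimator *e, double measured_rtt)
{
    const double g = 1.0 / 8.0;  /* gain applied to srtt */
    const double h = 1.0 / 4.0;  /* gain applied to rttvar */
    double delta = measured_rtt - e->srtt;
    double abs_delta = delta < 0 ? -delta : delta;

    e->srtt += g * delta;
    e->rttvar += h * (abs_delta - e->rttvar);
    return e->srtt + 4.0 * e->rttvar;
}
```

For example, starting from srtt = 100 and rttvar = 10, a measured RTT of 180 gives delta = 80, srtt = 110, rttvar = 27.5, and an RTO of 220 milliseconds.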
Another point made in [Jacobson and Karels, 1988] is that when the retransmis-
sion timer expires, an exponential backoff must be used for the next RTO. For exam-
ple, if our first RTO is 2 seconds and the reply is not received in this time, then the
next RTO is 4 seconds. If there is still no reply, the next RTO is 8 seconds, and then
16, and so on [Stevens et al., 2004].
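The doubling sequence above can be sketched as a one-line helper. The function name backoff_rto and the upper cap rto_max_ms are illustrative assumptions; the cap is included only so that the value cannot grow without bound.

```c
/* Double the RTO after the retransmission timer expires.
 * rto_max_ms is a hypothetical upper cap supplied by the caller. */
unsigned int backoff_rto(unsigned int rto_ms, unsigned int rto_max_ms)
{
    if (rto_ms > rto_max_ms / 2)
        return rto_max_ms;  /* doubling would exceed the cap */
    return rto_ms * 2;
}
```

Starting from 2000 milliseconds with a cap of 60000 milliseconds, repeated calls yield 4000, 8000, and 16000, matching the sequence in the text.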
2.5 Performance Deterioration of Jacobson’s Algorithm
A retransmission timer that is too conservative may cause long idle time before the lost
packet is retransmitted. This can degrade performance [Ludwig and Sklower, 2000].
In SCTP, RTO MIN is a constant that keeps the RTO from falling below 1,000
milliseconds. During the research, the static RTO MIN was observed to be a major
cause of the performance deterioration of Jacobson’s algorithm.
RTO MIN’s very liberal value of 1,000 milliseconds keeps Jacobson’s algorithm
from playing a proactive role in RTO calculation, as seen in Figure 2.2 and Figure 2.3.
In many circumstances (due to modern broadband networks) the RTT is well below
1,000 milliseconds. Thus, most of the time the result of Jacobson’s algorithm is not
used for the RTO, which causes a long idle time before a lost packet is detected.
In other words, SCTP has to wait for almost a whole second before realizing a
packet has been lost. In the long run this delay of approximately 1,000 milliseconds
turns out to be very costly in terms of transmission time.
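To make the cost concrete, the sketch below applies the static floor to a computed RTO and reports how much extra waiting the floor adds. The helper names (clamp_static, idle_excess_ms) and the 90 millisecond input used in the note are hypothetical; only the 1,000 millisecond constant comes from the text.

```c
#define STATIC_RTO_MIN_MS 1000u  /* static lower bound on the RTO */

/* RTO actually used once the static floor is applied. */
unsigned int clamp_static(unsigned int computed_rto_ms)
{
    return computed_rto_ms < STATIC_RTO_MIN_MS ? STATIC_RTO_MIN_MS
                                               : computed_rto_ms;
}

/* Extra time spent waiting for a timeout because of the floor. */
unsigned int idle_excess_ms(unsigned int computed_rto_ms)
{
    return clamp_static(computed_rto_ms) - computed_rto_ms;
}
```

If Jacobson’s algorithm computed, say, 90 milliseconds, the floor raises the timer to 1,000 milliseconds, so a lost packet is detected 910 milliseconds later than the estimators alone would suggest.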
Figure 2.2: In this figure we see how Jacobson’s algorithm currently behaves in SCTP. (X-axis: number of RTO and RTT updates; Y-axis: time in milliseconds; plotted series: RTO and RTT.)
Figure 2.3: This is a zoomed-in version of Figure 2.2 above.
Figure 2.4: Karn’s Algorithm - the Retransmission Ambiguity Problem. (Three client/server exchanges: (a) lost request, (b) lost reply, (c) RTO too short.)
2.6 Karn’s Algorithm
Another point made in [Jacobson and Karels, 1988] is that when the retransmission
timer expires, an exponential backoff must be used for the next RTO. For example,
if our first RTO is 2 seconds and the reply is not received in this time, then the next
RTO is 4 seconds. If there is still no reply, the next RTO is 8 seconds, and then 16,
and so on.
Jacobson’s algorithm specifies how to calculate the RTO each time an RTT is measured,
and how to increase the RTO on retransmission. However, a problem arises with the
ACK of a retransmitted packet. This is called the retransmission ambiguity problem.
Figure 2.4 shows the following three possible scenarios when the retransmission timer
expires.
(a) The request is lost.
(b) The reply is lost.
(c) The RTO is too small.
When the client receives a reply to a request that was retransmitted, it cannot
tell to which request the reply corresponds. In scenario (c) of Figure 2.4 the
reply corresponds to the original request, while in the two other scenarios the reply
corresponds to the retransmitted request.
Karn’s algorithm [Karn and Partridge, 1991] handles this scenario with the follow-
ing rules that apply whenever a reply is received for a request that was retransmitted.
• If an RTT was measured, do not use it to update the estimators since it is not
known to which request the reply corresponds.
• Since this reply arrived before the retransmission timer expired, reuse this RTO
for the next packet. Only when a reply is received to a request that was not
retransmitted will the RTT estimators be updated and the RTO recalculated.
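The two rules above can be sketched as a guard around the estimator update from Figure 2.1. The names rto_state and on_reply, and the was_retransmitted flag, are our own illustrative choices rather than part of any particular SCTP implementation.

```c
#include <stdbool.h>

/* Estimators and the RTO currently in use; all values in milliseconds. */
struct rto_state {
    double srtt;
    double rttvar;
    double rto;
};

/* Handle a reply. For a retransmitted request the sample is ambiguous, so
 * the estimators and the current RTO are left untouched (Karn's rules).
 * Otherwise the estimators and RTO are updated as in Figure 2.1. */
void on_reply(struct rto_state *s, double measured_rtt, bool was_retransmitted)
{
    if (was_retransmitted)
        return;  /* keep srtt, rttvar, and the current RTO */

    double delta = measured_rtt - s->srtt;
    double abs_delta = delta < 0 ? -delta : delta;

    s->srtt += delta / 8.0;
    s->rttvar += (abs_delta - s->rttvar) / 4.0;
    s->rto = s->srtt + 4.0 * s->rttvar;
}
```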
2.7 Fast Retransmission Timeout
In SCTP, fast retransmission is triggered by four SACKs. Whenever the sender
receives a SACK that reports missing data chunks, it will wait for three further SACKs
reporting the same data chunks as missing before doing the fast retransmission. By
waiting for four consecutive SACKs, SCTP tries to eliminate spurious retransmissions
caused by packets that are received out of order [Pedersen, 2006].
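A minimal sketch of this trigger keeps a per-chunk count of SACKs that reported the chunk as missing. The struct chunk_state, its fields, and the function name are hypothetical, and the sketch ignores the bookkeeping needed once the chunk is finally acknowledged.

```c
#include <stdbool.h>

#define FAST_RTX_THRESHOLD 4  /* four SACKs, as stated in the text */

/* Per-chunk bookkeeping (illustrative field names). */
struct chunk_state {
    unsigned int tsn;           /* Transmission Sequence Number */
    unsigned int miss_reports;  /* SACKs that reported this chunk missing */
    bool fast_retransmitted;    /* fast retransmission already done */
};

/* Record one SACK that reports the chunk as missing. Returns true exactly
 * once, when the fourth report arrives and the chunk should be fast
 * retransmitted. */
bool on_sack_reports_missing(struct chunk_state *c)
{
    if (c->fast_retransmitted)
        return false;
    c->miss_reports++;
    if (c->miss_reports >= FAST_RTX_THRESHOLD) {
        c->fast_retransmitted = true;
        return true;
    }
    return false;
}
```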
2.8 Related Work
A lot of research has been done to improve late retransmission. This section focuses
on what others have done in order to improve the late retransmission.
Early Fast Retransmit (EFR) is an optional mechanism in FreeBSD, which is
active whenever the congestion window is larger than the number of unacknowledged
packets and there are packets to be sent. It starts a timer that closely follows the
RTT and RTTVAR for every packet sent, and when the timer goes off and the stream
is still not using the entire congestion window, it retransmits all packets that could
have been acknowledged [Pedersen et al., 2006].
Much research has been done in the quest to improve Jacobson’s algorithm, but so
far there has not been a definite answer. Ekstrom and Ludwig [Ekstrom and Ludwig, 2004]
indicate that the RTO algorithm defined in RFC2988 [Paxson and Allman, 2000],
used in both TCP and SCTP, responds inappropriately to certain fluctuations in RTT.
This causes the characteristic spike in RTO when there is a sudden movement in
RTT, as seen in Figure 2.2. The reason behind the spike is that the RTTVAR computation
does not distinguish between positive and negative variations. Their proposed
algorithm alleviates the sudden fluctuation for a wide range of cases, and the findings
in [Pedersen et al., 2006] concur with their algorithm. However, the RTO produced by
their solution is on average higher than that of RFC2988 [Paxson and Allman, 2000],
which is not an optimal solution [Petlund et al., 2009].
The “Early Retransmit (ER)” algorithm [Allman et al., 2010] suggests that a
mechanism should be in place to recover lost segments when there are too few
unacknowledged packets to trigger Fast Retransmit. The Early Retransmit algorithm
reduces waiting time in four situations [Petlund et al., 2009]:
• The congestion window is still initially small.
• It is small because of heavy loss.
• Flow control limits the send window size.
• The application has no data to send.
The RTO MIN is an important factor in calculating the RTO itself. Some applications
may want to lower the RTO MIN to less than 1,000 milliseconds, which will
allow the sender to reach the maximum retransmission threshold faster in
the case of network failures. However, Allman and Paxson [Allman and Paxson, 1999]
warn that lowering the RTO MIN may have a negative impact on network behavior.
Other applications might want to eliminate the binary exponential
back-off concept from the RTO calculation in order to speed up failure detection.
RFC4166 advises against eliminating the binary exponential back-off altogether,
because when network congestion does occur, not backing off the timer may worsen
the congestion situation [Coene and Pastor-Balbas, 2006].
2.9 Summary
Even though there has been a tremendous amount of research in this area, there is
still a great deal of fine tuning to be done in terms of enhancing the retransmission
timeout algorithms. In the next chapter we are going to give a detailed description
of the algorithm we call the Adaptive RTO MIN (ARM) algorithm, which exploits the
variations in RTT and dynamic range of RTO. This algorithm improves performance
of SCTP by implementing a dynamically variable minimum value for RTO. Previously,
the minimum value of RTO (RTO MIN) has been statically defined.
CHAPTER 3
ADAPTIVE RTO MIN (ARM) ALGORITHM
3.1 Introduction
The idea behind the Adaptive RTO algorithm comes from observing the secluded role of
Jacobson’s algorithm in calculating the RTO with the lower bound set by the static
RTO MIN (SRM). With the SRM, Jacobson’s algorithm is idle the majority of the
time. The Adaptive RTO MIN (ARM) algorithm dynamically lowers the lower bound
of the RTO, thus forcing Jacobson’s algorithm to engage in the RTO’s calculation.
RFC-2960 defines RTO MIN as a constant, with a value of one second, which
is 1,000 milliseconds [Stewart et al., 2000]. The primary purpose of the RTO MIN
is to act as a lower bound for the RTO, i.e., the RTO’s value cannot fall below that
threshold. The RTO MIN constant makes the RTO very unresponsive to the RTT’s
sporadic behavior. For example, even if the RTT is at 700 milliseconds, the RTO will
not adjust accordingly, as seen in Figure 2.2.
The only way Jacobson’s Algorithm can play a proactive role in the RTO calcula-
tion is if the RTT value is in the vicinity of 800 to 900 milliseconds. In our research,
we found that waiting 1,000 milliseconds for the retransmission timer to expire is a
waste of time, since, as mentioned before, the RTT for the continental US is about 60
milliseconds. If there are some spurious RTTs once in a while, then why not catch
them early and retransmit for faster file transfer?
Modern bandwidth is large enough for us to afford such retransmissions, and
the results from the data collection show that doing so does not retransmit
any more than the current algorithm. On some occasions the Adaptive RTO algorithm
can even prevent retransmissions from happening and wasting resources.
3.2 Research Logistics
For this research we set up three computers, as seen in Figure 3.1. Two computers
ran an SCTP echo client program and resided in an external network, and the third
computer ran an SCTP echo server program and resided in the Texas State
University–San Marcos Computer Science Department. All three of them were running
Linux CentOS 5.4, kernel version 2.6.18, with the default SCTP module provided
by the kernel.
The physical distance between the computers was less than 50 miles. The RTT
between the two client computers and the server was typically around 30 milliseconds.
To compare the time taken by the Adaptive RTO MIN (ARM) algorithm and the
static RTO MIN (SRM), we transmitted the same file from the two client computers,
where one was running the ARM algorithm and the other the SRM. In addition, the client
computers transmitted the file in 10 streams, with the data being mirrored in
all 10 streams, as diagrammed in Figure 3.1.
3.3 Adaptive RTO MIN (ARM) Algorithm
The ARM algorithm calculates RTO MIN based on the current value of RTT. Dynamically
lowering the RTO MIN forces Jacobson’s algorithm to play a proactive role
in calculating RTO.
Figure 3.1: The SCTP Echo Server running at the Texas State University–San Marcos
Computer Science Department. (Diagram: SCTP Echo Client 1 and SCTP Echo Client 2 in the external network each send 10 streams through routers and the Internet to the SCTP Echo Server in the university network; RTT 28 to 32 ms.)
The Adaptive RTO MIN algorithm calculates the RTO MIN value based on the current
RTT with the principle of exponential decay. This is diagrammed in Figure 3.2 and
Figure 3.3.
There are two reasons for choosing the multiplicative values in the Adaptive RTO
MIN algorithm:
• Multiplication by 2, 1.75, 1.50, 1.25, and 1.125 is easy and efficient to calculate
using bit shift and addition operations, rather than floating point
calculations.
• When the RTT is in the lower range, from 1 to 50 milliseconds, it is safe to
set the RTO MIN to double the RTT, because the RTO MIN will then have an upper
bound of at most 100 milliseconds. Doubling the RTT while it is in the range of
201 to 400 milliseconds, by contrast, is not wise, because the RTO MIN’s upper
bound of 800 milliseconds would be almost the same as the static RTO MIN. Thus,
using an exponential decay principle in the Adaptive RTO MIN algorithm made the
most sense.
Figure 3.2: The Adaptive RTO algorithm divides the possible RTT values into five sectors. (X-axis: RTT value, 0 to 800; Y-axis: RTO MIN ratio, 0 to 2.25.)
    if (rtt <= 50)
        rto_min = rtt * 2;
    else if (rtt <= 100)
        rto_min = rtt * 1.75;
    else if (rtt <= 200)
        rto_min = rtt * 1.5;
    else if (rtt <= 400)
        rto_min = rtt * 1.25;
    else
        rto_min = rtt * 1.125;
Figure 3.3: Adaptive RTO algorithm.
Table 3.1: Data in this table are the outcome of the Adaptive RTO algorithm implementation.
RTO MIN and RTO After the Adaptive RTO Algorithm (all values in milliseconds)
Round Trip Time    Retransmission Time-Out    Retransmission Time-Out Minimum
29                 85                         58
33                 78                         66
28                 66                         56
32                 64                         64
62                 108                        108
106                159                        159
184                286                        276
215                401                        268
40                 356                        80
40                 320                        80
36                 291                        72
44                 252                        88
60                 211                        105
60                 175                        105
70                 152                        122
81                 150                        141
85                 148                        148
89                 155                        155
90                 157                        157
93                 162                        162
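The shift-and-add evaluation mentioned in the first bullet can be sketched as below. This is one possible integer formulation, not necessarily the one used in our kernel changes: the function name arm_rto_min is illustrative, and the decomposition of each multiplier into shifts is our assumption.

```c
/* Adaptive RTO MIN in milliseconds, computed from the current RTT in
 * milliseconds using only shifts and additions. */
unsigned int arm_rto_min(unsigned int rtt)
{
    if (rtt <= 50)
        return rtt << 1;                       /* x 2     */
    else if (rtt <= 100)
        return rtt + (rtt >> 1) + (rtt >> 2);  /* x 1.75  */
    else if (rtt <= 200)
        return rtt + (rtt >> 1);               /* x 1.5   */
    else if (rtt <= 400)
        return rtt + (rtt >> 2);               /* x 1.25  */
    else
        return rtt + (rtt >> 3);               /* x 1.125 */
}
```

Because each shift truncates separately, the result can be slightly lower than the floating point product for some inputs. For the RTT values listed in Table 3.1, such as 29, 62, 106, and 215 milliseconds, it yields 58, 108, 159, and 268 milliseconds, which agree with the tabulated RTO MIN.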
Figure 3.4 shows the resulting performance of the ARM algorithm. Note that
because of the ARM algorithm the RTO, which is calculated by Jacobson’s algorithm,
is no longer overestimated, as was the case with the static RTO MIN of 1,000 milliseconds.
By lowering the RTO MIN value and allowing it to update dynamically based
on current RTT values, Jacobson’s algorithm becomes more proactive in the RTO
calculation, but with enough room for the RTT to fluctuate and not over-congest the
network bandwidth.
Figure 3.4: This graph is generated from the data in Table 3.1. (X-axis: number of RTO, RTT, and RTO MIN updates; Y-axis: time in milliseconds; plotted series: RTO, RTT, RTO MIN.)
As shown in Figure 3.4, the ARM algorithm gives enough room for the RTO to
make the correct decision, but at the same time it does not let the RTO MIN fall too far
below. In a sense the RTO MIN is acting like a net for the RTO. Before, the RTO
constantly hovered at 1,000 milliseconds even if the RTT was around 30 milliseconds.
3.4 Data Gathering for Multiple Payloads
To verify the performance of the Adaptive RTO algorithm, a wealth of data was gathered
using the network configuration of Figure 3.1, with multiple payload sizes and
multiple file sizes. The payload size ranged from 50 bytes to 2000 bytes, and the file
sizes ranged from 32 kilobytes to 2048 kilobytes (2 megabytes). The interpretation
of this data shows that the ARM algorithm performs better with smaller payloads.
Figure 3.5: Without the Adaptive RTO MIN algorithm, the static RTO MIN holds the
RTO from falling below 1,000 milliseconds. (X-axis: number of RTO, RTT, and RTO MIN updates; Y-axis: time in milliseconds; plotted series: RTO, RTT, RTO MIN.)
As seen in Table 3.2, the Adaptive RTO algorithm behaves approximately the same
when the congestion level in the network is comparatively the same, i.e., when
there is no excessive misfiring of retransmissions. The conventional thought is
that lowering the RTO MIN might have undesirable side effects, such as network
congestion [Allman and Paxson, 1999, Coene and Pastor-Balbas, 2006]. The Adaptive
RTO algorithm and the static RTO MIN behave “approximately” the same while
operating in the non-congested mode. However, the Adaptive RTO algorithm still
outperforms the static RTO MIN in terms of total transmission time and, in most
cases, retransmissions due to fast retransmission (Fast RTX), albeit by small
margins.
The advantage comes when using small packets, as in “thin streams” or control
streams. During the data collection phase, the transfer rate with the Adaptive
RTO algorithm improved by about 5%, while still retaining approximately the same
retransmission rate as the static RTO MIN (SRM). Table 3.2 shows the findings for
multiple payload and file sizes.
In Table 3.2, each sub-table represents a different payload and file size.
Furthermore, each sub-table is divided into two column groups that represent data
collected without and with congestion. The rows of the sub-tables are defined
below:
• Number of Fast RTX: the total number of chunks retransmitted due to fast
retransmission.
• Number of RTX Time-Out: the total number of chunks retransmitted due to
time-out.
• Number of RTX PMTU: the total number of chunks retransmitted due to the
chunk's size being greater than the maximum transmission unit.
• Transmission Time in seconds: the total time taken to transmit a file.
In order to collect the data, a text file was transferred from the two client
computers to the echo server, as seen in Figure 3.1. The file size is given in
the heading of each sub-table in Table 3.2. Network congestion was emulated by
uploading a very large file in the background from one of the client computers.
Table 3.2: Data gathered using the static RTO MIN (SRM) and the Adaptive RTO MIN
algorithm (ARM), executed on multiple payload and file sizes.

Transferring 32 KB file with 50 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             164       177         1862      2005
Number of RTX Time-Out         5         0           55        342
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    1673      1667        5716      5498

Transferring 512 KB file with 500 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             656       159         3009      2350
Number of RTX Time-Out         34        7           88        383
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2756      2724        9013      8581

Transferring 1024 KB file with 1000 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             311       181         2223      2118
Number of RTX Time-Out         6         9           97        343
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2728      2721        9542      9334

Transferring 2048 KB file with 2000 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             362       290         2735      2572
Number of RTX Time-Out         14        9           64        393
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2742      2723        9224      9237
The data in Table 3.2 show the performance improvement achieved by implementing
the ARM algorithm over the standard SRM, which is discussed in detail in the
following sections.
3.4.1 Performance Evaluation for 50 bytes Payload
The data from Table 3.2 suggest that the ARM algorithm performs better with
small payload sizes.
When the payload size is 50 bytes and there is no network congestion, the ARM
algorithm transfers a file in approximately the same time as the SRM. The
retransmission rate due to both fast retransmission and time-out is approximately
the same.
When there is network congestion, however, the ARM algorithm outperforms the SRM
by transferring a file 4% faster. As expected, the number of chunks retransmitted
due to time-out increases for the ARM algorithm, but this number is minuscule in
comparison to the total number of chunks actually transmitted. In comparison to
the SRM, the ARM algorithm retransmitted 0.3% more chunks in total, which, as
stated earlier, is well within the network bandwidth's capacity.
3.4.2 Performance Evaluation for 500 bytes Payload
When the payload size is 500 bytes and there is no network congestion, the ARM
algorithm transfers a file in approximately the same time as the SRM. The
retransmission rate due to both fast retransmission and time-out is approximately
the same as well.
The ARM algorithm performs better than the SRM when there is network congestion.
The ARM algorithm transferred a file 5% faster than the SRM. The total number of
retransmissions caused by the ARM algorithm was in fact 0.1% lower than with the
SRM.
With the 500 bytes payload, the ARM algorithm seems to be more efficient than
with the 50 bytes payload, because it transferred a file in 5% less time, with
0.1% fewer retransmissions.
3.4.3 Performance Evaluation for 1000 bytes Payload
With a payload size of 1,000 bytes and no network congestion, the ARM algorithm
transmits a file in approximately the same time as the SRM, and the
retransmission rate is the same for both.
The ARM algorithm performs better than the SRM when there is network congestion.
The ARM algorithm transferred a file 2.23% faster than the SRM. The total number
of retransmissions caused by the ARM algorithm was approximately the same as with
the SRM.
3.4.4 Performance Evaluation for 2000 bytes Payload
As the payload size increases to 2,000 bytes, the ARM algorithm and the SRM
perform the same, with or without network congestion. The total number of chunks
retransmitted is approximately the same, and the time taken to transmit a file is
the same as well.
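The retransmission counts discussed in Sections 3.4.1 to 3.4.4 can be tallied
directly from the congested columns of Table 3.2. The sketch below is
illustrative only: the dictionary layout and function names are assumptions, and
the PMTU row is omitted because it is zero throughout. Only raw totals are
computed; the percentages quoted in the text appear to be relative to a total
chunk count that Table 3.2 does not report.

```python
# Congested columns of Table 3.2:
# payload size in bytes -> {variant: (fast RTX, time-out RTX)}
WITH_CONGESTION = {
    50: {"SRM": (1862, 55), "ARM": (2005, 342)},
    500: {"SRM": (3009, 88), "ARM": (2350, 383)},
    1000: {"SRM": (2223, 97), "ARM": (2118, 343)},
    2000: {"SRM": (2735, 64), "ARM": (2572, 393)},
}


def total_rtx(counts):
    """Sum fast and time-out retransmissions for one table cell pair."""
    fast, timeout = counts
    return fast + timeout


def rtx_totals(table):
    """Return payload -> {variant: total retransmitted chunks}."""
    return {
        payload: {variant: total_rtx(counts) for variant, counts in row.items()}
        for payload, row in table.items()
    }


totals = rtx_totals(WITH_CONGESTION)
for payload in sorted(totals):
    print(payload, totals[payload])
```

Splitting the totals this way makes the trade-off visible: under ARM the time-out
retransmissions rise in every sub-table, while the fast retransmissions fall in
three of the four.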
Table 3.3: This chart shows that when a chunk's payload size is relatively small,
the Adaptive RTO algorithm performs better than the static RTO MIN.
Comparison Chart
Payload Size (bytes)    Without Congestion    With Congestion
2,000                   ∼                     ∼
1,000                   ∼                     +2.2%
500                     ∼                     +5.03%
50                      ∼                     +4%
3.5 Algorithm Comparison Chart
Table 3.2 presents the data for different file and payload sizes, with and
without the ARM algorithm. A detailed interpretation of the data was provided in
the previous sections.
Table 3.3 is a comparison chart that summarizes the findings of the previous
sections. The chart shows how the ARM algorithm compares to the SRM with respect
to file transfer time. The convention used in the chart is as follows: ‘+’
represents a time gain achieved by the ARM algorithm, and ‘∼’ indicates that the
ARM algorithm's performance is approximately the same as the SRM's.
With the help of Table 3.3, it is easy to see the conditions that favor the ARM
algorithm. The data suggest that, under congestion, the ARM algorithm is better
than the SRM when the payload is relatively small (“thin stream”), and that it
performs approximately the same as the SRM when the payload size approaches
2,000 bytes.
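The entries of Table 3.3 can be reproduced from the transmission times in Table
3.2. The sketch below assumes the gain is computed as (SRM time - ARM time) / ARM
time. This formula is inferred from the reported figures (it yields 5.03% and
2.23%) rather than stated in the text, and the names used are illustrative.

```python
# Transmission times (seconds) under congestion, from Table 3.2:
# payload size in bytes -> (SRM time, ARM time)
TIMES_WITH_CONGESTION = {
    50: (5716, 5498),
    500: (9013, 8581),
    1000: (9542, 9334),
    2000: (9224, 9237),
}


def time_gain_percent(srm_time, arm_time):
    """Percentage by which the SRM time exceeds the ARM time.

    Positive values favour ARM. Dividing by the ARM time is an
    assumption inferred from the percentages in Table 3.3.
    """
    return (srm_time - arm_time) / arm_time * 100


for payload, (srm, arm) in sorted(TIMES_WITH_CONGESTION.items(), reverse=True):
    print(f"{payload:>5} bytes: {time_gain_percent(srm, arm):+.2f}%")
```

For the 2,000 bytes payload the same formula gives a difference of about -0.14%,
which Table 3.3 reports as ‘∼’.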
3.6 Summary
The data gathering and analysis in this chapter have shown the Adaptive RTO MIN
(ARM) algorithm to be a viable replacement for the current static RTO MIN
implementation. Under congestion, the ARM algorithm can transmit a file from one
endpoint to the other in up to 5% less time than the static RTO MIN without
unnecessary retransmission, and in some cases the ARM algorithm even reduced the
amount of retransmission compared to the static RTO MIN.
CHAPTER 4
CONCLUSIONS AND FUTURE WORK
In this thesis we have shown what others have done to enhance SCTP's
retransmission time-out performance, and how their approaches differ from ours.
We have also shown how the Adaptive RTO algorithm improves SCTP's performance
compared to the existing implementation.
A significant amount of data was gathered to show that the file transfer rate in
SCTP improves by about 5% when the Adaptive RTO algorithm is implemented. We have
been vigilant in making sure that the Adaptive RTO algorithm does not
unnecessarily clog the network's bandwidth, and that the current implementation
of RTO MIN can be safely replaced with the Adaptive RTO algorithm. Furthermore,
we have tested the Adaptive RTO algorithm extensively under multiple scenarios.
For example, data were gathered when the network was operating in a non-congested
environment, as well as when it was operating under heavy congestion. We have
also shown that the Adaptive RTO algorithm does not clog the network's bandwidth
under any of the tested conditions.
This is evolutionary research. This thesis does not solve all the problems, but
it sets up additional studies and hypotheses. Had time permitted, I would have
pursued the following, which in my opinion form the next cycle in this evolution:
• Design and implement congestion detection via the Adaptive RTO algorithm, and
make use of SCTP's unique feature, multihoming, to switch between IP addresses.
Currently this feature is not fully implemented in SCTP; the application layer
needs to handle the logic to select IP addresses. The Adaptive RTO algorithm
would help detect network congestion and help make better decisions when
switching between multiple IP addresses.
• Design and implement a better path selection algorithm via the Adaptive RTO
algorithm. The Adaptive RTO algorithm would detect a better path for
transferring data from one node to the other.
BIBLIOGRAPHY
[Allman et al., 2010] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and
Hurtig, P. (2010). Early retransmit for TCP and Stream Control Transmission
Protocol (SCTP).
[Allman and Paxson, 1999] Allman, M. and Paxson, V. (1999). On estimating
end-to-end network path properties. SIGCOMM.
[Coene and Pastor-Balbas, 2006] Coene, L. and Pastor-Balbas, J. (2006). Telephony
signaling transport over Stream Control Transmission Protocol (SCTP)
applicability statement.
[Ekstrom and Ludwig, 2004] Ekstrom, H. and Ludwig, R. (2004). The peak-hopper: A
new end-to-end retransmission timer for reliable unicast transport. INFOCOM.
[Jacobson and Karels, 1988] Jacobson, V. and Karels, M. J. (1988). Congestion
avoidance and control. ACM Computer Communications Review.
[Karn and Partridge, 1991] Karn, P. and Partridge, C. (1991). Improving
round-trip time estimates in reliable transport protocols. ACM Transactions on
Computer Systems, 9(4):364–373.
[Ludwig and Sklower, 2000] Ludwig, R. and Sklower, K. (2000). The Eifel
retransmission timer. ACM Computer Communications Review.
[Matthews, 2005] Matthews, J. (2005). Computer Networking Internet Protocols In
Action. Wiley, Hoboken, NJ, 1st edition.
[Ong and Yoakum, 2002] Ong, L. and Yoakum, J. (2002). An introduction to the
Stream Control Transmission Protocol (SCTP).
[Paxson and Allman, 2000] Paxson, V. and Allman, M. (2000). Computing TCP's
retransmission timer, RFC 2988 (proposed standard).
[Pedersen, 2006] Pedersen, J. (2006). Evaluation of SCTP retransmission delays.
Master's thesis, University of Oslo Department of Informatics.
[Pedersen et al., 2006] Pedersen, J., Griwodz, C., and Halvorsen, P. (2006).
Considerations of SCTP retransmission delays for thin streams. LCN.
[Petlund et al., 2009] Petlund, A., Beskow, P., Pedersen, J., Paaby, E. S.,
Griwodz, C., and Halvorsen, P. (2009). Improving SCTP retransmission delays for
time-dependent thin streams. Multimedia Tools and Applications, 45:33–60.
[Stevens et al., 2004] Stevens, W. R., Fenner, B., and Rudoff, A. M. (2004). UNIX
Network Programming The Socket Networking API Volume 1. Addison-Wesley, Boston,
MA, 3rd edition.
[Stewart et al., 2000] Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and Paxson, V.
(2000). Stream Control Transmission Protocol.
VITA
Sagun Khatri was born in Kathmandu, Nepal, on December 27, 1981, the son of
Sridhar and Sarita Khatri. After completing his work at Galaxy Public High School
in Kathmandu, Nepal, he entered Luther College–Decorah, Iowa. In the Fall of 2006,
he received the degree of Bachelor of Arts from Luther College–Decorah, Iowa. In
Fall 2008, he entered the Graduate College of Texas State University-San Marcos.
Permanent Address: 12800 Harrisglenn Drive Apt# 1534
Austin, Texas 78753
This thesis was typed by Sagun Khatri.