CSC/ECE 573 Internet Protocols Transmission Control Protocols
Jan 01, 2016
Copyright Rudra Dutta, NCSU, Spring, 2005 2
Let’s Design TCPOr rather, a transport protocol with the following characteristics Reliability (no erroneous packets + Guaranteed in-order
delivery) Flow control Congestion control
Copyright Rudra Dutta, NCSU, Spring, 2005 3
What Reliable Transport Protocols Provide
Goal Description
Addressing Delivery of data from sending application to correct receiving application
Reliable Delivery Receiver must get all the data the sender put on the network
In-Order Delivery Data segments must be delivered to the receiver in the same order they left the sender
Flow Control Sender’s transmission rate should adapt to the receiver's rate
Congestion Control Transmission rate should not exceed the speed of slowest link
Segmentation Data should be sent in segments whose size provides the highest throughput
Copyright Rudra Dutta, NCSU, Spring, 2005 4
Transport vs. Data-Link Protocols
Both protocols have to deal with error/flow control, etc.
Data Link Layer Transport Layer
Communication is over a physical channel
Communication is over a network
Explicit addressing not always required (e.g. point-to-point)
Explicit addressing required
Establishing a (single) connection is simple
Large and variable number of connections, establishment is
more difficult
A channel cannot reorder frames The network may store packets and deliver them later
Copyright Rudra Dutta, NCSU, Spring, 2005 5
Outline
I. Overview• TCP Message Format and Options
II. Connection Establishment & Termination
III. Data Transfer (Flow and Error Control)
IV. Timer Management
V. Congestion Control
Copyright Rudra Dutta, NCSU, Spring, 2005 7
TCP OverviewTransmission Control Protocol (RFCs 793, 1122, 1323)
Segment: unit of transfer, usually contained in a single IP datagram
Reliability achieved using Acknowledgments (therefore, seq. no.s) Timeouts, retransmissions Checksums on header and data
Sliding window mechanism for efficient transmission flow control congestion control
Important: several flavors (text covers early one)
Copyright Rudra Dutta, NCSU, Spring, 2005 8
Properties of TCP Reliable Service
Point-to-point
Stream orientation: TCP thinks of data as a stream of bytes
Unstructured stream: TCP does not honor structured data
Virtual circuit / connection orientation: state maintained at both ends
Buffered transfer: the software divides data stream into segments independent of application program transfers
Full duplex connection: concurrent data transfer in both directions (half duplex possible, but then, no piggy-backing)
Copyright Rudra Dutta, NCSU, Spring, 2005 10
TCP Connections and EndpointsTCP uses destination port to identify ultimate destination
TCP uses connection as fundamental abstraction connection: a pair of endpoints endpoint: (host_IP_address, port #) (socket)
TCP ports a port # does not correspond to a single object
UDP: single queue per port
a TCP port number can be shared by multiple connections on the same machine (remote endpoints differ)
Copyright Rudra Dutta, NCSU, Spring, 2005 11
TCP Header FieldsSequence number every byte in data stream is numbered sequence number = number of the first data byte in the segment
in the sender's byte stream wraps around after 232 -1
ACK number next sequence number the sender of the acknowledgment
expects to receive i.e., sequence number + 1 of last successfully received
consecutive data byte valid only when ACK flag = 1
Header length length of header (in bytes) / 4 With maximum value of 15 (24-1), header cannot exceed 60 bytes
Copyright Rudra Dutta, NCSU, Spring, 2005 12
TCP Header FieldsWindow size number of bytes (starting with the one specified in the ACK
field) that receiver is willing to accept 16 bits long; max value is 65535 used for flow control
Checksum similar to UDP computation (including pseudo-header) but use is mandatory
Urgent pointer sequence number of last byte of urgent data (later) added to sequence number of segment
Options most common option is maximum segment size (MSS)
Copyright Rudra Dutta, NCSU, Spring, 2005 13
Description of Flags in Control Field
Flag Description
URG The value of the urgent pointer is valid
ACK The value of the acknowledgment number is valid
PSH Push the data (pass data to receiver as quickly as possible, later)
RST The connection must be reset
SYN Synchronize the sequence numbers during connection establishment
FIN The sender has no more data to transmit
Copyright Rudra Dutta, NCSU, Spring, 2005 15
End of Options
Used for padding at end of the option field
Only one end-of-option used --> after this option receiver looks for payload data
Copyright Rudra Dutta, NCSU, Spring, 2005 16
No Operation OptionIf multiple bytes needed for alignment, no-op option is used
Copyright Rudra Dutta, NCSU, Spring, 2005 17
Maximum Segment Size Option
Declared during connection establishment phase (in SYN segments)cannot be specified during data transfer
Defines the maximum segment size of data the sender is willing to accept
MSS must be interface MTU - 40 bytes default: 536
Copyright Rudra Dutta, NCSU, Spring, 2005 18
Long Fat Pipes (LFN)Bandwidth-delay product (capacity of "pipe") Bandwidth (b/s) X Round-trip-time (RTT, in s) Note: consider transmission and propagation delays
Example: trans-continental T3 RTT = 6000 Km * 5 s / Km = 30 ms B-D Product = 45 Mbps X 30 ms = 170 KB
TCP cannot handle “Long Fat Pipes” efficiently small window size (16 bits maximum of 64 KB) small sequence number space (go-back-N ARQ protocol)
Copyright Rudra Dutta, NCSU, Spring, 2005 19
Window Scale OptionNew window size = window size in header X 2scale-factor
equivalent to “shift window size in header left by a number of bits equal to the window scale factor”
only carried in segments with SYN flag = 1 (permanent for the connection)Limit on value (look it up)
Copyright Rudra Dutta, NCSU, Spring, 2005 20
Timestamp OptionSender puts timestamp option in segment, reflected by receiver in the acknowledgment Can compute RTT for each received ACK allows sender to compute RTT more accurately clock synchronization between two hosts is not required
Without timestamps, RTT calculated once every window OK for 8-segment windows but larger windows require better RTT calculations
Copyright Rudra Dutta, NCSU, Spring, 2005 22
Connection Establishment Issues
Naive approach:Connection Request messageConnection Accepted messageSequence numbers always start at 0
Problem: delayed duplicates
Copyright Rudra Dutta, NCSU, Spring, 2005 25
Connection Establishment Solution
Limit packet lifetime T = Max Segment Lifetime (compare with TTL)
Use long sequence numbers wrap around time > 2T = time for a packet and its ACKs to “die”
Choose a different initial sequence number (ISN) with each connection request
Ignore additional requests for connection after establishment
After host crash: wait for time 2T Allow time for old packets to die off
Copyright Rudra Dutta, NCSU, Spring, 2005 29
Simultaneous OpenA connects with B, B connects with A at same time (pass each other in the network) Only one connection will be established! Using only 2 ports (one on A, one on B)
Copyright Rudra Dutta, NCSU, Spring, 2005 30
Connection Termination IssuesAsymmetric (unilateral) release abrupt, data may be lost
Symmetric (bilateral) release each direction released independently of the other each side can close its transmission
“half closed” state is possible
To avoid data loss in a symmetric release… no side should disconnect until it is convinced that
the other side is also prepared to disconnect same as “two-army” problem: no solution!
Copyright Rudra Dutta, NCSU, Spring, 2005 31
TCP Connection Termination
1. A sender closes its part of the connection by sending a FIN segment
2. After ACKing the FIN, the receiver can still send data on its part of the connection (half-close)
3. Finally, the receiver closes its part of the connection by sending a FIN segment (graceful close)
Copyright Rudra Dutta, NCSU, Spring, 2005 33
Simultaneous Close
A sends FIN to B, B sends FIN to A at same time (pass each other in the network)
FIN_WAIT_1
CLOSING
TIME_WAIT
FIN_WAIT_1
CLOSING
TIME_WAIT
FIN J
FIN K
Ack K+1 Ack J+1
A B
Copyright Rudra Dutta, NCSU, Spring, 2005 34
Connection Reset
A connection can also be aborted with a RST segment (hard reset)normally reserved for error conditions, not
normal termination
Copyright Rudra Dutta, NCSU, Spring, 2005 35
Half Closed Connection
One end of connection (e.g. clientserver) terminates (sends FIN and receives ACK of FIN)
Other end (serverclient) remains open (sending data)
Other end (server) later terminates (sends FIN and receives ACK of FIN), and connection is then completely closed
Copyright Rudra Dutta, NCSU, Spring, 2005 36
TCP Connection StatesCLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Copyright Rudra Dutta, NCSU, Spring, 2005 37
TCP: Normal Client Open and Close
CLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Appl: active
openSend: SYN
Recv: SYN,ACK
Send: ACK
Appl: close
Send: FIN
Recv: ACK
Send: (nothing)
Recv: FIN
Send: ACK
2MSL timeout occurs
Send: (nothing)
Copyright Rudra Dutta, NCSU, Spring, 2005 38
TCP: Simultaneous OpenCLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Appl: active
openSend: SYN
Recv: SYN
Send: SYN,ACK
Appl: close
Send: FIN
Recv: ACK
Send: (nothing)
Recv: FIN
Send: ACK
2MSL timeout occurs
Send: (nothing)
Recv: ACKSend: (nothing)
Copyright Rudra Dutta, NCSU, Spring, 2005 39
TCP: Simultaneous CloseCLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Appl: active
openSend: SYN
Recv: SYN,ACK
Send: ACK
Appl: close
Send: FIN
Recv: FIN
Send: ACK
2MSL timeout occurs
Send: (nothing)
Recv: ACK
Send: (nothing)
Recv: FIN,ACKSend: ACK
Copyright Rudra Dutta, NCSU, Spring, 2005 40
TCP: Normal Server Open and Close
CLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Appl: passive open
Send: (nothing)
Recv: SYN
Send:SYN,ACK
Recv: ACKSend: (nothing)
Recv: FIN
Send: ACKAppl: close
Send: FIN
Recv: ACK
Send: (nothing)
Copyright Rudra Dutta, NCSU, Spring, 2005 41
TCP: Client Resets Connection Before Establishment Complete
CLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED CLOSE_WAIT
LAST_ACKFIN_WAIT_1 CLOSING
FIN_WAIT_2 TIME_WAIT
Appl: passive open
Send: (nothing)
Recv: SYN
Send:SYN,ACKRecv: RST
Send: (nothing)
Copyright Rudra Dutta, NCSU, Spring, 2005 44
Data Transfer in TCPData received from an application usually sent in segments of size MSS
ACKs carry the sequence number of the next byte receiver expects to receive this is a cumulative count
Unacknowledged data = data sent by sender, but not yet acknowledged by the receiver i.e., packets not yet received, or acknowledgment not yet
received
Each ACK carries a window advertisement window = how many additional bytes the receiver is prepared
to accept before the sender must wait for an acknowledgment size of window specified in bytes, not segments
Copyright Rudra Dutta, NCSU, Spring, 2005 45
Sliding Windows
TCP sliding window mechanism allows multiple segments to be sent before an ACK is returned
Left boundary of window = earliest unacknowledged bytean acknowledgment advances this left boundary
Right boundary of window = latest unacknowledged bytea window advertisement advances this right
boundary
Copyright Rudra Dutta, NCSU, Spring, 2005 49
Pushing Data + Urgent Data
Buffering by TCP sender improves efficiency & throughput not appropriate for all applications (why not?)
For interactive applications, can use push operation forces TCP to send data without waiting for the buffer to fill sets the PSH flag: forces delivery to application at receiving
end
There is no “interrupt” message to send out-of-band information (e.g., interrupt, abort signals) instead, set the URG flag urgent pointer points to the urgent data in the segment
Copyright Rudra Dutta, NCSU, Spring, 2005 50
Performance Issues
Small packets, and small window advertisements, create efficiency problems
These can solved bydelaying sending of data
sender “voluntarily” consolidates multiple small packets into a single larger packet
delaying sending of ACKs/window advertisements
sender “strongly encouraged” to consolidate multiple small packets into a single larger packet
Copyright Rudra Dutta, NCSU, Spring, 2005 51
“Silly Window” () SyndromeA serious problem in sliding window operation
Causes1. sending application program creates data slowly
2. receiving application program consumes data slowly
In either case, data may be sent in small segments inefficient use of bandwidth increased processing by TCP
TCP now requires avoidance heuristics
Copyright Rudra Dutta, NCSU, Spring, 2005 53
SWS Caused by Sender
Example: use of Telnet, running interactive editor at a remote hostpotentially, two 41-byte and two 40-byte
datagrams sent for each character typed Sender: send of 1 byte (1+40 bytes)Receiver: acknowledgment of 1 byte (40
bytes)Receiver: echo of 1 byte (1+40 bytes)Sender: acknowledgment of echo (40
bytes)
Copyright Rudra Dutta, NCSU, Spring, 2005 54
Sender Side HeuristicNagle's algorithm when application generates data 1 byte at a time, send
the first byte and buffer all the rest until… an ACK is received, or a maximum sized segment is filled, or half the window has filled (sent but not yet ack’ed)
The rule is applied even if Push has been requested
Features a self-clocking algorithm (does not compute delays) little overhead: no timers, clock readings, etc. does not affect throughput in “normal” use (i.e., when
there is lots of data to transmit)
Copyright Rudra Dutta, NCSU, Spring, 2005 55
Yet Another Override
Nagle’s algorithm is widely used in TCP implementations, but may have to be turned off in some cases ex.: mouse movements in an X-windows
application running over the InternetWhenever delay due to “consolidation” is
unacceptable
Copyright Rudra Dutta, NCSU, Spring, 2005 56
Receiver Side HeuristicsClark's solution do not send window advertisements for 1 byte instead, advertise a window size of zero and wait until
there is space for a maximum segment size of data, or the receive buffer is half-empty
whichever occurs earlier
Nagle's algorithm and Clark's solution are complementary and can be used together
Problems? Solutions? Consider the possibility of loss See persist timer
Copyright Rudra Dutta, NCSU, Spring, 2005 57
Receiver Side HeuristicsDelay window advertisements by delaying ACKs Until avoidance algorithm allows window advertisement May result in delaying ACK until there is data to send, so
can “piggyback” ACK with data May result in single ACK for multiple segments
Reduces traffic but may force sender to timeout and retransmit a
segment (incorrectly) ACKs may not be delayed for more than 500ms RTT estimation may suffer ACK at least every other segment
Copyright Rudra Dutta, NCSU, Spring, 2005 58
TCP Error Control
1. Error detection checksum: to check for corrupted
segments at destination ACK: to confirm receipt of segment by
destination time-out: one retransmission timer for
each segment sent
2. Error correction – no FEC source retransmits segments for which
retransmission timer expired
Copyright Rudra Dutta, NCSU, Spring, 2005 59
Responses to Errors
Fundamentally GoBackNDifferent versions of TCP have different
detail behavior
Corrupted or lost segmentswill not be acknowledged by receiverwill be retransmitted by sender
Duplicate segments ignored by receiver
Copyright Rudra Dutta, NCSU, Spring, 2005 60
Responses to Errors (cont’d)Out-of-order segments (receive segment n+1 before segment n)buffered by receiver to improve efficiencynot ACKed send a “duplicate” ACK for the last acknowledged segment
Lost ACKsmay lead to retransmission of datamost likely will not be noticed (another, later ACK also
indicates segment was received)
Combination of GoBackN, NACK, SRP
Copyright Rudra Dutta, NCSU, Spring, 2005 62
Types of TCP Timers
1. Retransmission
2. Persistence
3. Keep-Alive
4. Time-Waited
Copyright Rudra Dutta, NCSU, Spring, 2005 63
1. Retransmission Timer
A. When a segment is sent, a retransmission timer is started
B. If segment is ACKed before timer expires, the timer is stopped
C. If timer expires before ACK arrives, the segment is retransmitted and the timer is started again
Copyright Rudra Dutta, NCSU, Spring, 2005 64
Retransmission Timer(cont’d)How long should timeout interval be? too short: retransmitting when don’t need to! too long: waiting too long to retransmit when needed!
Terminology RTTsample = time measured for round trip of one
packet RTT = estimate of current round trip time RTO = retransmission timeout interval
Issues obtain accurate estimate of round-trip time choose a good value for retransmission timeout
interval
Copyright Rudra Dutta, NCSU, Spring, 2005 65
Measurement of Round-Trip Time
Basic RTT estimation explicitly measure the interval between the time a
segment is transmitted and the time the ACK for the segment returns
may use TCP timestamp option RTT = * RTT + (1-)* RTTsample
Recommended value for = .9
Karn's improvement do not update RTT for any segments that have been
retransmitted avoids ambiguity about which timer value to use
Copyright Rudra Dutta, NCSU, Spring, 2005 66
Determining RTO
Basic (original) methodRTO = 2 * RTT (i.e. allow two roundtrip
times before timing out)
ProblemWhen RTT has a high standard deviation,
this method is too conservative (times out too quickly)
Copyright Rudra Dutta, NCSU, Spring, 2005 67
Determining RTO (cont’d)Solution estimate both mean and standard deviation of RTT set RTO = to RTT + a multiple of the standard deviation use mean deviation (MDEV)as a cheap estimate of standard
deviation
Algorithm
Comments 99% of packets arrive within 4 standard deviations The approximations above are extremely efficient computationally
RTT = RTT + *(RTTsample – RTT)
MDEV = * MDEV + (1-) * RTTsample - RTT
RTO = RTT + 4 * MDEV
( = .75, = .125)
Copyright Rudra Dutta, NCSU, Spring, 2005 68
TCP Coarse-Grain TimerTCP uses a 500-ms clock for time measurements values obtained may differ up to 500 ms from actual
value`
Copyright Rudra Dutta, NCSU, Spring, 2005 69
Retransmission Timer BackoffNot updating the RTT for lost / retransmitted segments may lead to trouble
Retransmission may have occurred because RTO is too low E.g. maybe network suddenly became overloaded should therefore increase RTO
Exponential backoff upon timeout, set RTO = 2 * RTO continue doubling of RTO for each retransmission when an ACK arrives for a segment that did not
require retransmission, update RTT and RTO according to the “normal” computation
Copyright Rudra Dutta, NCSU, Spring, 2005 70
2. Persist Timer
It is possible for the advertised window to go to 0 i.e., sender is transmitting faster than
receiver can receive
The receiver can request more data by sending an ACK advertising the size of the available receive buffer
If this ACK is lost deadlock!
Copyright Rudra Dutta, NCSU, Spring, 2005 71
2. Persist Timer (cont’d)
After receiving any ACK with window_size = 0, start a persist timerUpon expiration of the persist timer, send a
probe packetProbe packet = 1 byte of data
The response (ACK) to the probe gives the window size recovers from lost window advertisements
by “stimulating” new advertisements
Copyright Rudra Dutta, NCSU, Spring, 2005 73
3. Keepalive Timer
It is possible for TCP connections to be idle for a long timee.g., client opens a connection to a server,
but never makes a requestconsumes resources (memory) on the
server
Keepalive timer maintained by some implementationsnot part of TCP standardsomewhat controversial
Copyright Rudra Dutta, NCSU, Spring, 2005 74
3. Keepalive Timer (cont’d)
The timer is reset every time a segment is received timer interval = 2 hours
When timer expires, check if other side is still there repeat 10 times at 75 second intervals if there is no response, the connection is
terminated
Copyright Rudra Dutta, NCSU, Spring, 2005 75
4. Time-Wait TimerAfter agreeing with the other side to close a connection, TCP enters the TIME_WAIT state starts a timer that runs for twice the maximum
packet lifetime the connection is held in limbo during this period the connection is closed after the timer expires
This ensures that all packets (and their ACKs) created during the connection have “died off” (reached their destination or been discarded)
Copyright Rudra Dutta, NCSU, Spring, 2005 77
Congestion Control: “Open Loop” Approach
1. Source describes the traffic it will be sending through the network problem: choosing right set of parameters to
describe a source
2. The network reserves resources (bandwidth, buffer space) during connection establishment resources will be held by the connection for the
duration
3. Source “shapes” its traffic to match its description
Copyright Rudra Dutta, NCSU, Spring, 2005 78
Open Loop (cont’d)
Results network overload / congestion will be
avoided connection performance is guaranteed
Significant amount of work in this area ATM standards
Copyright Rudra Dutta, NCSU, Spring, 2005 79
Congestion Control: “Closed Loop” Approach
1. Source establishes connection, no traffic description required
2. Feedback about congestion is continually provided
3. Source dynamically adapts its traffic characteristics to reduce congestion or use available bandwidth source must decide how to best react to current
network state not policed or enforced by the network!
Copyright Rudra Dutta, NCSU, Spring, 2005 80
“Closed Loop” Approach (cont’d)Results congestion is detected and then reduced no guarantees of connection performance
How does the source learn that its service rate has changed? Choices:1. explicit feedback: the network conveys this info to
the source example: ICMP source quench
2. implicit feedback: source infers a change in its service rate by measuring its current end-to-end performance example: TCP acknowledgments
Copyright Rudra Dutta, NCSU, Spring, 2005 81
TCP Congestion Control
1. Uses a closed-loop approach
2. Feedback is implicit (ACKs)
3. Adapts rate based on dynamic window smaller window = lower rate, larger
window = higher rate
4. Control mechanism is end-to-end no connection state maintained by the
network!
Copyright Rudra Dutta, NCSU, Spring, 2005 82
TCP Implicit Feedback
TCP relies on feedback from the network to detect congestion timeout caused by a lost packet (used by
TCP-Tahoe, TCP-Reno)duplicate ACK (used by TCP-Reno)
Assumption: packet loss is due to congested routers, not transmission errors Incorrect assumption for wireless networks!
Copyright Rudra Dutta, NCSU, Spring, 2005 83
TCP Dynamic Windows
Window #1: advertised by receiverpurpose: avoid overrunning a slow receiver (i.e.,
for flow control)
Window #2: maintained by senderpurpose: avoid network overload (i.e. for
congestion control) Called the congestion window, or cwnd the sending TCP dynamically manipulates cwnd
“The” window size = MIN(receiver_advertisement, cwnd)
Copyright Rudra Dutta, NCSU, Spring, 2005 84
TCP Congestion Control Overview1. Probe for available bandwidth by increasing cwnd
“slow” start = exponential increase phase congestion avoidance = linear increase phase
2. Slow start threshold (= ssthresh) controls transition from exponential to linear phase
3. Upon packet loss (timeout, duplicate acknowledgment)… retransmit the packet reduce the window size
TCP Reno adds (1) fast retransmit, and (2) fast recovery
Copyright Rudra Dutta, NCSU, Spring, 2005 85
Slow StartAt connection establishmentcwnd 1, ssthresh 1/2 initial receiver window
advertisement (both cwnd and ssthresh are in units of MSS in
the following)
Upon arrival of each ACK: cwnd cwnd + 1
Not slow; exponential increase!window increases by cwnd every RTTcan quickly congest the network and cause packet
loss
Copyright Rudra Dutta, NCSU, Spring, 2005 87
Congestion AvoidanceBasic idea: avoid increasing the window size too quickly slow down when approaching the “steady state”
Use slow start as long as cwnd ssthresh, enter congestion avoidance phase when cwnd > ssthresh
Upon arrival of each ACK: cwnd cwnd + 1/cwndLinear increase window increases by 1 every RTT
Copyright Rudra Dutta, NCSU, Spring, 2005 88
Response to Congestion (TCP Tahoe/Reno)
When congestion occurs Packet deliver may be slowed, or fail Acknowledgment delivery may be slowed, or fail In either case, timeout / retransmission may occur
transmission_window = MIN(receiver_advertisement,cwnd) ssthresh MAX(2, ½* transmission_window)
When timeout occurs… cwnd 1 (enter slow start phase, from the top)
Copyright Rudra Dutta, NCSU, Spring, 2005 90
“Slow Start”, “Exponential Decay”Consider only steady state conditions
Increase of transmission windowCongestion avoidance in the steady stateNot slow startWindow increases by 1 each RTTAdditive increase
Decrease of transmission windowOnly congestion, in steady stateSuccessive timeoutsWindow halved in each RTTMultiplicative decrease
Copyright Rudra Dutta, NCSU, Spring, 2005 91
How Good is This?On the whole a simple strategy Logic not obvious
Consider the fairness issues
Two TCP connections over a bottleneck link Ideally, should share bandwidth But start times and windows could be very different
Copyright Rudra Dutta, NCSU, Spring, 2005 92
How Good is This?Together, the connections cannot get more than bottleneck bandwidth And should get that much
We would like them to be fair and share equally
Consider a situation when this is not true
Successive adjustments tend to stabilize
Copyright Rudra Dutta, NCSU, Spring, 2005 93
Fast Retransmission (TCP Reno)When out-of-order segment arrives receiver issues ACK with the same cumulative sequence
number as the previous ACK called a duplicate ACK (acts like a NAK) Indicates data is still flowing between sender and receiver
Receipt of 3 duplicate ACKs in a row for segment X is a strong indication that segment X has been lost
Sender actions retransmit the (presumably) lost segment without waiting for
its timer to expire, and no delay ssthresh ½ * transmission_window But cwnd is not set to 1 called fast retransmission
Copyright Rudra Dutta, NCSU, Spring, 2005 94
Fast Recovery (TCP Reno)Receipt of duplicate ACKs implies data is still flowing: do not enter slow start yet
After a fast retransmission cwnd ssthresh + 3 fast recovery, enter linear phase
Each time another duplicate ACK arrives cwnd cwnd + 1
When ACK for a retransmitted segment arrives cwnd ssthresh Back to normal
Copyright Rudra Dutta, NCSU, Spring, 2005 95
Congestion ExampleSource/receiver have buffers of 8192 bytes
MSS = 1024
Copyright Rudra Dutta, NCSU, Spring, 2005 99
TCP and Transactions (Request / Reply)
Many TCP transactions consist simply of a request to a server and a reply to the client
Problems these simple transactions may require the
exchange of 10 segments (why?)a simple transaction also takes at least
two RTT times plus the processing time at the server
Copyright Rudra Dutta, NCSU, Spring, 2005 100
TCP Extension for Transactions (cont'd)
To distinguish between consecutive transactions, a connection count (CC) TCP option is added to the headerThe server maintains a per-host cache of the most recent CC MSS window size RTO
The client combines the following into one segment SYN Request FIN also includes the current CC value
Copyright Rudra Dutta, NCSU, Spring, 2005 101
If the received CC > the cached CC, the server combines the following into a single segment
SYN ACK of the client’s SYN reply FIN
Else, the normal 3-way handshake is performed
Client ACKs the server's (SYN + FIN + data) in one segment
Time needed is the minimum 1 RTT + processing time at server
TCP Extension for Transactions (cont'd)