Chapter 6 Congestion Control and Resource Allocation Copyright © 2010, Elsevier Inc. All rights Reserved
Dec 21, 2015
Problem
• We have seen enough layers of the protocol hierarchy to understand how data can be transferred among processes across heterogeneous networks
• Problem
– How to effectively and fairly allocate resources among a collection of competing users?
Congestion Control and Resource Allocation
• Resources
– Bandwidth of the links
– Buffers at the routers and switches
• What is congestion?
– “Too many sources sending too much data too fast for the network to handle”
– A higher rate of input to a router than output
– Packets compete at a router for the use of a link
– Each competing packet is placed in a queue, waiting for its turn to be transmitted over the link
Congestion Control and Resource Allocation
• When too many packets are contending for the same link
– Queuing/buffering at routers
• Long delays
– The buffers at routers overflow
• Dropped packets, resulting in lost packets
• The network should provide a congestion control mechanism to deal with such a situation
Congestion Control vs Flow Control
• Congestion control is a global issue
– It involves every router and host within the subnet
• Flow control
– Its scope is point-to-point: it involves just the sender and receiver
– How did TCP implement this?
Congestion Control and Resource Allocation
• Congestion control and resource allocation
– Two sides of the same coin
• If the network takes an active role in allocating resources
– Congestion may be avoided
– No need for congestion control
Congestion Control and Resource Allocation
• Allocating resources with any precision is difficult
– Resources are distributed throughout the network
• On the other hand, we can always let the sources send as much data as they want
– Then recover from congestion when it occurs
– An easier approach
– But it can be disruptive, because many packets may be discarded by the network before the congestion can be controlled
Question
• Congestion control and resource allocation involve both hosts and network elements such as routers
• What can a host do to control congestion?
• What can a router do to control congestion?
Answer
• Congestion control and resource allocation involve both hosts and network elements such as routers
• In network elements (routers)
– Various queuing disciplines can be used to control:
1. the order in which packets get transmitted, and
2. which packets get dropped
• At the hosts
– The congestion control mechanism paces how fast sources are allowed to send packets
Issues in Resource Allocation
• Packet-switched network (internet)
– Resource allocation involves multiple links and switches/routers
– Congested intermediate links
– In such an environment, a given source may have more than enough capacity on the immediate outgoing link to send a packet
– Bottleneck router: but somewhere in the middle of the network, its packets encounter a link that is being used by many different traffic sources
Question
• What are the causes of a bottleneck router?
Answer
1. The amount of incoming data is too large, AND
2. The speed of data arrival at the router is too high
– Why?
Issues in Resource Allocation
• Network Model
– Packet-switched network
A potential bottleneck router.
Issues in Resource Allocation
• Connectionless flows
• For much of our discussion, we assume that the network is essentially connectionless
• Any connection-oriented service needed is implemented in the transport protocol running on the end hosts
• Why this assumption?
Issues in Resource Allocation
• Why this assumption?
– Because our original assumptions were too restrictive/strong:
1. Connectionless vs connection-oriented networks
– There IS a grey area in between
2. Datagrams are completely independent in a connectionless network
– Yes, datagrams ARE switched independently
– But usually a stream of datagrams between a particular pair of hosts flows through a particular set of routers
Issues in Resource Allocation
• Network Model
– Connectionless flows
Multiple flows passing through a set of routers
Issues in Resource Allocation
• Multiple related packets flow through each router
• Routers maintain soft state (some state information) for each flow that passes through them
• Soft state is refreshed by periodic messages, or otherwise expires
• Soft state (e.g. PIM) vs hard state (e.g. RIP)
• The soft state of a flow can be used to make resource allocation decisions about the packets that belong to the flow
Issues in Resource Allocation
• Soft state represents a middle ground between:
– a purely connectionless network that maintains no state at the routers, and
– a purely connection-oriented network that maintains hard state at the routers
• In general, the correct operation of the network does not depend on soft state being present
– Each packet is still routed correctly without regard to this state
• But when a packet happens to belong to a flow for which the router is currently maintaining soft state, the router is better able to handle the packet
Taxonomy
• There are countless ways in which resource allocation mechanisms differ
• We will discuss 3 dimensions along which resource allocation mechanisms can be characterized:
1. Router-centric vs host-centric
2. Reservation-based vs feedback-based
3. Window-based vs rate-based
Issues in Resource Allocation
• Router-centric vs host-centric
1. Router-centric design:
• Each router takes responsibility for:
– deciding when packets are forwarded,
– selecting which packets are to be dropped, and
– informing the hosts that are generating the network traffic how many packets they are allowed to send
2. Host-centric design:
• End hosts:
– observe the network conditions
» e.g. observe how many packets they are successfully getting through the network
– adjust their behavior accordingly
• Note that these two groups are not mutually exclusive
Issues in Resource Allocation
• Reservation-based vs feedback-based
1. Reservation-based system:
• Some entity (e.g., the end host) asks the network for a certain amount of capacity to be allocated for a flow
• Each router then allocates enough resources (buffers and/or a percentage of the link’s bandwidth) to satisfy this request
• If the request cannot be satisfied at some router, because doing so would overcommit its resources, the router rejects the reservation
2. Feedback-based approach:
• The end hosts begin sending data without first reserving any capacity
• End hosts then adjust their sending rate according to the feedback they receive
• EXPLICIT feedback: a router sends a “slow down” message
• IMPLICIT feedback: the end host adjusts its sending rate based on observable network conditions, e.g. packet loss
Issues in Resource Allocation
• Window-based vs rate-based
– Window-based:
• The receiver advertises a window to the sender (window advertisement)
– Rate-based:
• The receiver controls the sender’s behavior using a rate:
– how many bits per second the receiver or network is able to absorb
• E.g. multimedia streaming applications
Evaluating Resource Allocation Schemes
• Resource allocation schemes can be evaluated based on:
– Effectiveness
– Fairness
Evaluate Effectiveness of Resource Allocation
• Consider the two principal metrics of networking:
– throughput, and
– delay
• Clearly, we want as much throughput and as little delay as possible
• Unfortunately, these goals are often somewhat at odds with each other
Evaluate Effectiveness of Resource Allocation
• Increase throughput:
– An idle link hurts throughput
– So, allow as many packets into the network as possible
– Goal: drive the utilization of all the links up to 100%
• Problem with this strategy:
– Increasing the number of packets in the network also increases the length of the queues at each router
– Longer queues, in turn, mean packets are delayed longer in the network
Evaluate Effectiveness of Resource Allocation
• Throughput and delay relationship:
– Described using the throughput-to-delay ratio
– Used as a metric for evaluating the effectiveness of a resource allocation scheme
• This ratio is sometimes referred to as the power of the network:
– Power = Throughput / Delay
Evaluate Effectiveness of Resource Allocation
Ratio of throughput to delay as a function of load
Fair Resource Allocation
• What exactly constitutes fair resource allocation?
• For example, a reservation-based resource allocation scheme provides an explicit way to create controlled unfairness
• With such a scheme, we might use reservations to enable a video stream to receive 1 Mbps across some link while a file transfer receives only 10 Kbps over the same link
Fair Resource Allocation
• When several flows share a particular link, we would like each flow to receive an equal share of the bandwidth
• This definition presumes that a fair share of bandwidth means an equal share of bandwidth
• But equal shares may not equate to fair shares
– Why?
Fair Resource Allocation
• Should we also consider the lengths of the paths being compared?
– Consider the figure: one four-hop flow competing with three one-hop flows
• What does “fair” mean?
– An equal share of resources for all flows?
– Proportional to how much you pay for the service?
– Should we take route length into account?
Fair Resource Allocation
• Raj Jain’s fairness index:
– A metric to quantify the fairness of a congestion-control mechanism
– Assume: fair implies equal, and all paths are of equal length
• Definition:
– Given a set of flow throughputs (x1, x2, . . . , xn), the following function assigns a fairness index to the flows:
– f(x1, x2, . . . , xn) = (x1 + x2 + · · · + xn)² / (n × (x1² + x2² + · · · + xn²))
– The fairness index is always a number between 0 and 1, with 1 representing greatest fairness
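The definition above can be checked with a short sketch; the function name is mine:

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2)."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))

print(jain_fairness([10, 10, 10, 10]))  # equal shares -> 1.0
print(jain_fairness([20, 20, 20, 0]))   # one starved flow -> 0.75
```

Note that the index degrades toward 1/n as a single flow captures all the bandwidth.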
Queuing Disciplines
• Each router must implement some queuing discipline
– A scheduling policy
– A drop policy
• Queuing disciplines allocate bandwidth and buffer space:
– Bandwidth: which packet to serve (transmit) next
– Buffer space: which packet to drop next (when required)
• Queuing also affects latency
Queuing Disciplines
1. FIFO queuing
2. Priority queuing
3. Fair queuing
FIFO Queuing
• FIFO queuing:
– A scheduling discipline
– First-Come-First-Served (FCFS) queuing
– The first packet that arrives at the router is transmitted first
• Tail drop:
– A drop policy
– If a packet arrives and the queue (buffer space) is full, the router discards that packet regardless of flow or importance
Queuing Disciplines
(a) FIFO queuing
(b) tail drop at a FIFO queue
Question
• What are the problems of FIFO scheduling?
• How suitable is FIFO queuing for sending both Voice over IP and emails?
Answer
• FIFO: first-in first-out scheduling
– Simple, but restrictive
• Voice over IP and email:
– Two kinds of traffic (different flows)
– Their transmission requirements differ:
• Voice over IP needs low delay
• Email can tolerate delay
• A FIFO queue treats all packets the same:
– If voice traffic arrives after email traffic, VoIP traffic waits behind email in the FIFO queue (not acceptable!)
FIFO + Drop-tail Problems
• Lock-out problem
– A few flows can monopolize the queue space
• A flow that sends more fills most of the FIFO queue
• Full queues
– Occur if the routers’ queues are often full
– TCP detects congestion from loss
– This forces the network to have long-standing queues in steady state
– Queuing delays: bad for time-sensitive traffic
Priority Queuing
• A simple variation on basic FIFO queuing
• Mark each packet with a priority
• The routers implement multiple FIFO queues, one per priority class
• Always transmit high-priority traffic when present
• Possible starvation of lower priorities
– http://www.cisco.com/c/en/us/td/docs/ios/12_2sb/feature/guide/mpq.html
(Figure: Priority 1 Queue, Priority 2 Queue, Priority 3 Queue)
Fair Queuing
• Solves the main problem with FIFO queuing:
– FIFO queuing does not discriminate between different traffic sources/flows
• The fair queuing (FQ) algorithm addresses this problem:
– Maintain a separate queue for each flow currently being handled by the router
– The router then services these queues in a round-robin fashion
Fair Queuing
Round-robin service of four flows at a router
Question
• Does fair queuing actually queue fairly?
Answer
• The main complication with fair queuing:
– Packet lengths vary: the packets being processed at a router are not necessarily the same length
• To truly allocate the bandwidth of the outgoing link fairly, it is necessary to take packet length into consideration
– E.g. a router is managing two flows:
• one with 1000-byte packets and the other with 500-byte packets
• If you use packet-by-packet round-robin:
– the first flow gets 2/3 of the link’s bandwidth, and
– the second flow gets only 1/3 of the bandwidth
• Unfair!
Bit-by-bit Round-Robin
• What we really want is bit-by-bit round-robin
• Bit-by-bit round-robin:
– The router transmits a bit from flow 1, then a bit from flow 2, and so on
– Clearly, it is not feasible to interleave the bits from different packets
– The FQ mechanism therefore simulates this behavior:
1. Determine the finishing time of a packet if it were transmitted bit-by-bit round-robin
2. Use this finishing time to sequence the packets for transmission
Queuing Disciplines
• Bit-by-bit round-robin:
– To understand the algorithm for approximating bit-by-bit round-robin, consider the behavior of a single flow
– For this flow, let:
• Pi: packet length: the length of packet i
• Si: start time: the time when the router starts to transmit packet i
• Fi: finish time: the time when the router finishes transmitting packet i
• Fi = Si + Pi
– Assume Pi is expressed as the time to transmit the packet (1 bit = 1 second)
Queuing Disciplines
• When do we start transmitting packet i?
– It depends on whether packet i arrived before or after the router finished transmitting packet i−1 of the flow
• If packet i arrived before the router finished transmitting packet i−1:
– The first bit of packet i is transmitted immediately after the last bit of packet i−1
• If packet i arrived after the router finished transmitting packet i−1:
– i.e. the router finished sending packet i−1 long before packet i arrived
– i.e. for some time the queue for this flow was empty
Queuing Disciplines
– Let Ai denote the time that packet i arrives at the router
– Then Si = max(Fi−1, Ai)
– Fi = max(Fi−1, Ai) + Pi
• Now, for every flow, we calculate Fi for each packet that arrives using this formula
• We then treat all the Fi as timestamps
• The next packet to transmit is always the packet with the lowest timestamp
– The packet that should finish transmission before all others
– i.e. a shorter packet that arrives can be inserted in front of a longer packet already in the queue
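The timestamp rule above can be sketched as follows. This is a simplification (names are mine): it applies the per-flow formula Fi = max(Fi−1, Ai) + Pi and sorts by timestamp, ignoring the refinement real FQ implementations make with a shared virtual clock:

```python
def fq_order(flows):
    """Order packets by simulated bit-by-bit finishing time.
    flows: {flow_name: [(arrival_time, length), ...]}, with lengths
    already expressed as transmission times (1 bit = 1 time unit)."""
    stamped = []
    for name, packets in flows.items():
        finish = 0  # F_{i-1} for this flow
        for arrival, length in packets:
            # Si = max(F_{i-1}, Ai);  Fi = Si + Pi
            finish = max(finish, arrival) + length
            stamped.append((finish, name))
    stamped.sort()  # lowest timestamp transmits first
    return [name for _, name in stamped]

# A short packet on flow "b" jumps ahead of flow "a"'s longer packets:
print(fq_order({"a": [(0, 1000), (0, 1000)], "b": [(0, 500)]}))
# -> ['b', 'a', 'a']
```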
Queuing Disciplines
• E.g. fair queuing:
(a) packets with earlier finishing times are sent first
(b) sending of a packet already in progress is completed
– Both packets in flow 1 have earlier finishing times than the packet in flow 2
– The router has already begun sending the packet from flow 2 when a packet from flow 1 arrives
– The implementation does not preempt a packet already being transmitted (so it is not exactly bit-by-bit fair queuing)
TCP Congestion Control
• TCP congestion control was introduced into the Internet in the late 1980s by Van Jacobson, roughly eight years after the TCP/IP protocol stack had become operational
• Immediately preceding this time, the Internet was suffering from congestion collapse:
– hosts sent their packets into the Internet as fast as the advertised window would allow,
– congestion would occur at some router (causing packets to be dropped),
– hosts would time out and retransmit their packets, resulting in even more congestion
TCP Congestion Control
• The idea of TCP congestion control:
– Each source determines how much capacity is available in the network, so that it knows how many packets it can safely have in transit
– Once a given source has this many packets in transit, it uses the arrival of an ACK as a signal that one of its packets has left the network, and that it is therefore safe to insert a new packet into the network without adding to the level of congestion
– By using ACKs to pace the transmission of packets, TCP is said to be self-clocking
TCP Congestion Control
• There are quite a few TCP congestion control mechanisms in use today:
1. Additive Increase / Multiplicative Decrease (AIMD)
2. Slow Start
3. Fast Retransmit and Fast Recovery
Additive Increase Multiplicative Decrease (AIMD)
• Congestion window:
– TCP maintains a new variable, CongestionWindow, per connection
– The source uses this to limit the amount of data it is allowed to have in transit at a given time
• The congestion window is congestion control’s counterpart to flow control’s advertised window
• Maximum number of bytes of unacknowledged data allowed = min(CongestionWindow, AdvertisedWindow)
Additive Increase Multiplicative Decrease
• TCP’s effective window is revised as follows:
– MaxWindow = MIN(CongestionWindow, AdvertisedWindow)
– EffectiveWindow = MaxWindow − (LastByteSent − LastByteAcked)
– Recall from before: LastByteSent − LastByteAcked <= AdvertisedWindow
• That is, MaxWindow replaces AdvertisedWindow in the calculation of EffectiveWindow
• Thus, a TCP source is allowed to send no faster than the slowest component, the network or the destination host, can accommodate
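The window calculation above is just two lines of arithmetic; a sketch (the function name and the example byte counts are mine):

```python
def effective_window(cwnd, advertised, last_byte_sent, last_byte_acked):
    """EffectiveWindow = min(CongestionWindow, AdvertisedWindow)
                         - (LastByteSent - LastByteAcked)."""
    max_window = min(cwnd, advertised)
    return max_window - (last_byte_sent - last_byte_acked)

# Network allows 8000 bytes, receiver allows 16000, 3000 already in flight:
print(effective_window(8000, 16000, 5000, 2000))  # -> 5000
```

When the congestion window is the smaller of the two, the network (not the receiver) is the limiting component.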
Additive Increase Multiplicative Decrease
• How does TCP come to learn an appropriate value for CongestionWindow?
– The sender needs to know both the congestion window and the advertised window
– AdvertisedWindow is sent by the receiver
– But there is no one to send a suitable CongestionWindow to the sender!
Additive Increase Multiplicative Decrease
• Answer:
– The TCP sender sets CongestionWindow based on the level of congestion it perceives to exist in the network
– The sender decreases the congestion window when the level of congestion goes up
– The sender increases the congestion window when the level of congestion goes down
– Taken together, the mechanism is commonly called additive increase/multiplicative decrease (AIMD)
Additive Increase Multiplicative Decrease
• How does the source determine that the network is congested and that it should decrease the congestion window?
Additive Increase Multiplicative Decrease
• Answer: multiplicative decrease
• The main reason for undelivered packets, and hence timeouts, is that a packet was dropped due to congestion
• Therefore, TCP interprets timeouts as a sign of congestion and reduces its current transmission rate
• Specifically, each time a timeout occurs, the source sets CongestionWindow to half of its previous value
• Repetition of this cycle gives the classic sawtooth pattern
Additive Increase Multiplicative Decrease
Additive Increase Multiplicative Decrease
• Although CongestionWindow is defined in terms of bytes, it is easiest to understand multiplicative decrease in terms of whole packets
– E.g. suppose the current CongestionWindow = 16 packets
– If a loss is detected, CongestionWindow is set to 8
– Additional losses cause CongestionWindow to be reduced to 4, then 2, and finally to 1 packet
– CongestionWindow is not allowed to fall below one MSS, i.e. the size of a single packet
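The halving sequence in the example can be sketched directly, counting the window in packets so the one-MSS floor is one packet (names are mine):

```python
MSS = 1  # window measured in packets, so the floor is one packet

def on_timeout(cwnd):
    """Multiplicative decrease: halve the window, never below one MSS."""
    return max(cwnd // 2, MSS)

cwnd, history = 16, []
for _ in range(5):  # five consecutive losses
    cwnd = on_timeout(cwnd)
    history.append(cwnd)
print(history)  # -> [8, 4, 2, 1, 1]
```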
Additive Increase Multiplicative Decrease
• Additive increase:
– Every time the source successfully sends a CongestionWindow’s worth of packets, that is, each packet sent out during the last RTT has been ACKed, it adds the equivalent of 1 packet to CongestionWindow
– i.e. increment CongestionWindow by one packet per RTT (until a packet loss occurs, at which point AIMD enters multiplicative-decrease mode)
Additive Increase Multiplicative Decrease
Packets in transit during additive increase, with one packet being added each RTT.
Additive Increase Multiplicative Decrease
• In practice, each time an ACK arrives the congestion window is incremented as follows:
– Increment = MSS × (MSS / CongestionWindow)
– CongestionWindow += Increment
– (subject to CongestionWindow >= MSS)
– i.e., rather than incrementing CongestionWindow by an entire MSS each RTT, we increment it by a fraction of MSS every time an ACK is received
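The per-ACK increment can be sketched as below (the MSS value and names are illustrative). After roughly one window's worth of ACKs, the window has grown by roughly one MSS, approximating the one-packet-per-RTT rule:

```python
MSS = 1000  # bytes; an assumed segment size for illustration

def on_ack(cwnd):
    """Per-ACK additive increase: cwnd += MSS * (MSS / cwnd)."""
    return cwnd + MSS * (MSS / cwnd)

cwnd = 4 * MSS
for _ in range(4):   # about one window's worth of ACKs in one RTT
    cwnd = on_ack(cwnd)
print(cwnd)          # slightly under 5 * MSS, since cwnd grows mid-RTT
```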
Question
• Why does TCP decrease the congestion window aggressively and increase it conservatively?
Answer
• One intuitive reason is that the consequences of having too large a window are much worse than those of having one that is too small
• For example, when the window is too large, packets that are dropped will be retransmitted, resulting in even more congestion
• Thus, it is important to get out of this state quickly
Additive Increase Multiplicative Decrease
• Problem:
– AIMD is good for channels operating close to network capacity
– But AIMD takes a long time to ramp up to full capacity when it has to start from scratch
– Why? AIMD increases the congestion window linearly
• Solution:
– Use slow start to increase the window rapidly from a cold start
Slow Start
• Another congestion control mechanism
• Increases the congestion window rapidly from a cold start
• Effectively increases the congestion window exponentially, rather than linearly
Slow Start
• Start:
– Set congestion window = 1 packet
– Send 1 packet
• Upon receipt of the ACK:
– Set congestion window = 1 + 1 = 2 packets
– Send 2 packets
• Upon receipt of 2 ACKs:
– Set congestion window = 2 + 2 = 4 packets
– Send 4 packets
• The end result is that TCP effectively doubles the number of packets it has in transit every RTT
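The doubling in the steps above can be sketched per RTT (names are mine):

```python
def slow_start(rtts):
    """Window (in packets) at the start of each RTT: every ACK adds one
    packet, so the window doubles every round trip."""
    cwnd, history = 1, []
    for _ in range(rtts):
        history.append(cwnd)
        cwnd *= 2
    return history

print(slow_start(5))  # -> [1, 2, 4, 8, 16]
```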
Slow Start
Slow start: exponential increase in sender window
Additive increase: linear increase in sender window
Slow Start
• There are two situations in which slow start runs:
1. At the very beginning of a connection (cold start)
2. When the connection goes dead while waiting for a timeout to occur
Slow Start
1. At the very beginning of a connection (cold start):
• The source doesn’t know how many packets it can have in transit at a given time
– TCP runs over everything from a 9600 bps link to a 2.4 Gbps link
• Slow start continues to double CongestionWindow each RTT until there is a loss
• When a loss is detected, the timeout causes multiplicative decrease to divide CongestionWindow by 2
Slow Start
2. When the connection goes dead while waiting for a timeout to occur:
• Original TCP sliding window:
– The sender has sent an advertised window’s worth of data
– A packet gets lost
– The source waits for an ACK, eventually times out, and retransmits the lost packet
– The source then gets a single cumulative ACK that reopens the entire advertised window
• Slow start:
– The source instead uses slow start to restart the flow of data, rather than dumping a whole window’s worth of data on the network all at once
Slow Start
• Target congestion window:
– After detecting a packet loss:
• Target congestion window = CongestionWindow prior to the last packet loss / 2
• This is called the target congestion window, as opposed to the current congestion window (the actual congestion window)
Slow Start
• Now we need to remember both:
– The “target” congestion window:
• resulting from multiplicative decrease
• called the congestion threshold (or simply threshold)
– The “actual” congestion window:
• used by slow start
• simply called the congestion window (CW)
Slow Start
• How to update the current/actual CW?
– Upon packet loss, calculate threshold = CW / 2
– Reset the actual CW = 1
– Use slow start to rapidly increase the sending rate up to the target CW, and then use additive increase beyond that point
• The target congestion window is where slow start ends and AIMD begins
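The loss/threshold/regrowth cycle described above can be sketched as two small functions (names are mine; the window is counted in packets, one step per RTT):

```python
def on_loss(cwnd):
    """Loss: threshold = cwnd / 2, actual window resets to 1 packet."""
    return 1, max(cwnd // 2, 1)

def grow(cwnd, threshold):
    """One RTT of growth: slow start below the threshold, additive
    increase (one packet per RTT) at or above it."""
    if cwnd < threshold:
        return min(2 * cwnd, threshold)
    return cwnd + 1

cwnd, threshold = on_loss(32)    # -> cwnd = 1, threshold = 16
trace = []
for _ in range(7):
    cwnd = grow(cwnd, threshold)
    trace.append(cwnd)
print(trace)  # exponential up to the threshold, then linear:
# -> [2, 4, 8, 16, 17, 18, 19]
```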
Slow Start
Behavior of TCP congestion control.
Colored line = value of CongestionWindow over time; solid bullets at top of graph = timeouts; hash marks at top of graph = time when each packet is transmitted; vertical bars = time when a packet that was eventually retransmitted was first transmitted.
The same trace, annotated: the steep ramps are slow starts; the straight segments are additive increase.
Slow Start
• Initial slow start:
– Rapid increase due to the exponential growth of the sender window
Slow Start
• Packets lost. Why?
– TCP attempts to learn the network bandwidth
– It uses exponential growth of the congestion window to do this
– The source runs the risk of having half a window’s worth of packets dropped by the network
• E.g. network capacity = 16 packets
• The source sends 1, 2, 4, 8, 16 successfully, then bumps the congestion window up to 32
• 32 − 16 = 16 packets are dropped (worst case; some packets might be buffered at some router)
Slow Start
• The congestion window flattens. Why?
– No new packets are sent (notice: no hash marks)
– Because several packets were lost
– So no ACKs arrive at the sender
Slow Start
• A timeout occurs:
– The congestion window is divided by 2: ~34 KB becomes ~17 KB
– Set CongestionThreshold = ~17 KB
– Reset CongestionWindow = 1 packet
– Use exponential increase in CongestionWindow to climb to CongestionThreshold
– Use additive increase afterwards (not apparent in the figure)
Slow Start
• After packets are lost and a timeout occurs:
– The congestion window is reset to 1 packet
– It starts ramping up from there using slow start
– Then additive increase runs afterwards
Fast Retransmit and Fast Recovery
• The coarse-grained implementation of TCP timeouts led to long periods of time during which the connection went dead while waiting for a timer to expire (“no packets sent” periods in the trace)
• To solve this, a new mechanism called fast retransmit was added to TCP
• Fast retransmit is a heuristic that sometimes triggers the retransmission of a dropped packet sooner than the regular timeout mechanism
Fast Retransmit and Fast Recovery
• Fast retransmit
– The receiver sends a duplicate ACK:
• The receiver gets data
• The receiver responds with an acknowledgment, even if this sequence number has already been acknowledged (a duplicate ACK)
Fast Retransmit and Fast Recovery
• Out-of-order packet receipt:
– TCP cannot yet acknowledge the data the packet contains, because earlier data has not yet arrived
– TCP resends the same acknowledgment (a duplicate ACK) it sent the last time
– The sender receives a duplicate ACK
– So the sender interprets this as the receiver having got a packet out of order
Question
• What are the reasons for receiving packets out of order?
Fast Retransmit and Fast Recovery
• Answer:
– An earlier packet might have been lost or delayed
• Question:
– How does the sender make sure it is actually a lost packet, not a delayed packet?
Fast Retransmit and Fast Recovery
– The sender waits until it sees some number of duplicate ACKs (to make sure it is a lost packet rather than a delayed one)
– Why does the receiver send multiple duplicate ACKs?
Fast Retransmit and Fast Recovery
– For every packet sent after the missing packet, the receiver sends a duplicate ACK
– The sender then retransmits the missing packet
– In practice, TCP waits until it has seen three duplicate ACKs before retransmitting the packet
Fast Retransmit and Fast Recovery
• The figure illustrates how duplicate ACKs lead to a fast retransmit:
– The destination receives packets 1 and 2, but packet 3 is lost in the network
– The destination sends a duplicate ACK for packet 2 when packet 4 arrives, again when packet 5 arrives, and so on
– To simplify this example, we think in terms of packets 1, 2, 3, and so on, rather than worrying about the sequence numbers for each byte
– The sender sees the 3rd duplicate ACK for packet 2 and retransmits packet 3
– The receiver then sends a cumulative ACK for everything up to and including packet 6
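The sender's side of this example, counting duplicate ACKs and retransmitting on the third duplicate, can be sketched as follows (class and threshold names are mine; packets are numbered as in the example rather than by byte sequence numbers):

```python
DUP_ACK_THRESHOLD = 3  # TCP fast-retransmits on the third duplicate ACK

class Sender:
    """Counts duplicate ACKs; fast-retransmits on the third duplicate."""
    def __init__(self):
        self.last_ack = None
        self.dup_count = 0
        self.retransmitted = []

    def on_ack(self, ack):
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                # Resend the first unacknowledged packet
                self.retransmitted.append(ack + 1)
        else:
            self.last_ack, self.dup_count = ack, 0

s = Sender()
# Packet 3 is lost; packets 4, 5, 6 each trigger a duplicate ACK for 2:
for ack in [1, 2, 2, 2, 2]:
    s.on_ack(ack)
print(s.retransmitted)  # -> [3]
```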
Fast Retransmit and Fast Recovery
Trace of TCP with fast retransmit. Colored line = CongestionWindow; solid bullet = timeout; hash marks = time when each packet is transmitted; vertical bars = time when a packet that was eventually retransmitted was first transmitted.
Fast Retransmit and Fast Recovery
Annotations on the trace: “no packets sent” periods; some losses are recovered by fast retransmit, while others still see no fast retransmit.
Fast Retransmit and Fast Recovery
• This improves throughput by ~20%
• However, it does not completely eliminate all coarse-grained timeouts
– Why?
Fast Retransmit and Fast Recovery
• The fast retransmit strategy does not eliminate all coarse-grained timeouts
• This is because, for a small window size, there will not be enough packets in transit to cause enough duplicate ACKs to be delivered
Fast Retransmit and Fast Recovery
• Fast recovery
– A fast retransmission signals congestion
• due to lost packets
– With fast recovery:
• The sender, instead of returning to slow start, uses pure AIMD
– Why?
» Slow start would reset CW = 1 and rapidly climb back to the threshold, which is unnecessary
• i.e. the sender simply reduces the congestion window by half and resumes additive increase
• Thus recovery is faster; this is called fast recovery
TCP Reno
• The version of TCP in which fast retransmit and fast recovery are added on top of the previous congestion control mechanisms is called TCP Reno
– It has other features:
• Header compression (if ACKs are being received regularly, omit some fields of the TCP header)
• Delayed ACKs: ACK only every other segment
Where are we?
• We are done with Section 6.3
• We now move on to more sophisticated congestion avoidance mechanisms
Why Congestion Avoidance?
• TCP does congestion control:
– A reactive approach:
• Let congestion occur, then control it
– TCP needs to create losses to determine the available bandwidth:
• TCP increases the load to determine when congestion occurs and then backs off
• Packet losses are used to detect congestion
– This is costly: it causes packet losses
Congestion Avoidance Mechanism
• Can we do better?
– Avoid congestion?
– Prevent losses?
– We need a proactive approach!
• Can we predict the onset of congestion?
– If so, we can reduce the sending rate just before packets are dropped
Congestion Avoidance Mechanism
• We introduce 3 congestion avoidance mechanisms:
1. DECbit
2. Random Early Detection (RED)
3. Source-based congestion avoidance
Congestion Avoidance Mechanism
• Router-based congestion avoidance:
– DECbit, RED
– Put additional functionality into the router to help end nodes detect congestion
• Host-based congestion avoidance:
– Source-based congestion avoidance
– Avoid congestion purely from the end nodes
DECbit
• Evenly splits the responsibility for congestion control between the end hosts and the routers
• At the router:
– The router monitors congestion
– It explicitly notifies the end hosts when congestion is about to occur
– How to notify?
• Each packet has a “congestion notification” bit, called the DECbit, in its header
• If the router is congested, it sets the DECbit of packets that flow through it
DECbit
• At the destination end host:
– The notification reaches the destination
– The destination copies the bit into the ACK that it sends to the source
• At the source end host:
– In response, the source adjusts its transmit rate
Question
• How does a router monitor congestion?
– When does a router set the DECbit of packets flowing through it?
DECbit
• The router sets the DECbit of a packet if its average queue length is >= 1 at the time the packet arrives
• Therefore, the criterion for congestion is:
– average queue length at the router >= 1 packet
Chapter 6: DECbit
(Figure: timeline showing the last busy+idle cycle followed by the current busy cycle)
• How to measure the average queue length?
– The queue length is averaged over a time interval that spans the last busy+idle cycle plus the current busy cycle.
– Busy: the router is sending data
– Idle: the router is not sending data
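The averaging rule above can be sketched as a time-weighted mean of queue-length samples. This is a simplified model (a real router integrates the queue length continuously over the cycle boundary); the `(duration, queue_len)` pair representation is an assumption for illustration.

```python
def avg_queue_length(samples):
    """Time-weighted average queue length.

    samples: list of (duration, queue_len) pairs covering the last
    busy+idle cycle plus the current busy cycle.
    """
    total_time = sum(d for d, _ in samples)
    if total_time == 0:
        return 0.0
    return sum(d * q for d, q in samples) / total_time

def should_set_decbit(samples):
    # DECbit is set when the average queue length is >= 1 packet
    return avg_queue_length(samples) >= 1.0
```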
106
Chapter 6: DECbit
• The source adjusts its rate to avoid congestion.
– It counts the fraction of DECbits set in each congestion window.
– If < 50% are set, increase the rate additively (increase by 1).
– If ≥ 50% are set, decrease the rate multiplicatively (multiply by 0.875).
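The source policy above can be sketched in a few lines. The function name and window representation (a packet count) are assumptions for illustration; the 0.5 threshold and 0.875 factor come from the slides.

```python
def adjust_window(cwnd, decbits_set, packets_in_window):
    """DECbit source policy sketch: look at the fraction of ACKed
    packets in the last window that had the congestion bit set."""
    if packets_in_window == 0:
        return cwnd
    fraction = decbits_set / packets_in_window
    if fraction < 0.5:
        return cwnd + 1      # additive increase
    return cwnd * 0.875      # multiplicative decrease
```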
107
Chapter 6: Random Early Detection (RED)
• RED is based on the same idea as DECbit:
– Each router monitors its queue length
– When a router detects that congestion is about to happen, it notifies the source to adjust its congestion window.
• RED implicitly notifies the sender by dropping packets
– The source learns of the drop via a timeout or duplicate ACKs
– So it works well with TCP
108
Chapter 6: Random Early Detection (RED)
• Early detection:
– The router drops a few packets before the queue is full.
– This notifies the source to slow down sooner than it normally would.
– So the router does not have to drop lots of packets later on.
109
Chapter 6: Random Early Detection (RED)
• When to drop a packet? Which packets to drop?
• Answer: early random drop:
• Drop each arriving packet with some drop probability whenever the queue length exceeds some drop level.
• The drop probability increases as the average queue length increases.
110
Chapter 6: Random Early Detection (RED)
• Computing the average queue length:
– Using a weighted running average:
• AvgLen = (1 − Weight) × AvgLen + Weight × SampleLen
• Weight: a constant between 0 and 1
• SampleLen: the queue length when a sample measurement is made.
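The weighted running average above is a one-line update. The default weight of 0.002 used here is a commonly cited RED value, not something stated on the slides.

```python
def update_avg(avg_len, sample_len, weight=0.002):
    """Weighted running average of the queue length:
    AvgLen = (1 - Weight) * AvgLen + Weight * SampleLen.
    Small weights make the average respond slowly, filtering bursts."""
    return (1 - weight) * avg_len + weight * sample_len
```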
111
Chapter 6: Random Early Detection (RED)
• The average queue length captures the notion of long-lived congestion better than an instantaneous measure.
– Under bursty traffic, queues fill and empty quickly; using the instantaneous queue length to decide congestion is misleading (it reflects only a small burst of traffic, and the queue is not so full at other times).
– Averaging filters out these short-term changes.
112
Chapter 6: Random Early Detection (RED)
• RED has two queue length thresholds that trigger certain activity:
– MinThreshold
– MaxThreshold
• When a packet arrives at the router, RED compares the current AvgLen with these two thresholds, according to the following rules:
• if AvgLen ≤ MinThreshold → queue the packet
• if MinThreshold < AvgLen < MaxThreshold → calculate probability P and drop the arriving packet with probability P (random drop)
• if MaxThreshold ≤ AvgLen → drop the arriving packet
113
Chapter 6: Random Early Detection (RED)
• Drop probability P:
– A function of both AvgLen and how long it has been since the last packet was dropped.
• P is computed as follows:
TempP = MaxP × (AvgLen − MinThreshold)/(MaxThreshold − MinThreshold)
P = TempP/(1 − count × TempP)
• Note: the above calculation assumes the queue size is measured in packets.
• count: the number of packets queued since the last drop (a proxy for the time elapsed since a packet was last dropped)
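Putting the threshold rules and the two formulas together, the per-packet decision can be sketched as below. The threshold and MaxP values are illustrative assumptions, not values from the slides.

```python
import random

MIN_THRESH, MAX_THRESH, MAX_P = 5, 15, 0.02  # illustrative values

def red_drop(avg_len, count, rng=random.random):
    """RED per-packet decision (sketch).

    avg_len: current average queue length, in packets
    count:   packets queued since the last drop
    Returns (drop, new_count).
    """
    if avg_len < MIN_THRESH:
        return False, count + 1          # queue the packet
    if avg_len >= MAX_THRESH:
        return True, 0                   # always drop in this region
    # Random-drop region: probability grows with AvgLen...
    temp_p = MAX_P * (avg_len - MIN_THRESH) / (MAX_THRESH - MIN_THRESH)
    # ...and with count, which spaces drops out in time.
    p = temp_p / (1 - count * temp_p)
    if rng() < p:
        return True, 0
    return False, count + 1
```

Passing a deterministic `rng` makes the behavior testable; a real router would use a hardware or pseudo-random source.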
114
Chapter 6: Random Early Detection (RED)
if MinThreshold < AvgLen < MaxThreshold:
TempP = MaxP × (AvgLen − MinThreshold)/(MaxThreshold − MinThreshold)
P = TempP/(1 − count × TempP)
(Figures: RED thresholds on a FIFO queue; the drop probability function for RED, with TempP rising from 0 at MinThreshold to MaxP, the maximum drop probability, at MaxThreshold)
• count: number of packets queued since the last drop
• Extra step: dividing TempP by (1 − count × TempP) to get P helps space out drops
115
Chapter 6: Question
• Why is spacing out packet drops necessary?
116
Chapter 6: Answer
• If you used TempP directly as the drop probability, packet drops would not be well distributed in time: they would occur in clusters.
• Why are clustered packet drops bad?
– Packet arrivals from a given connection are likely to arrive in bursts (clusters)
– A burst from one connection can push AvgLen into the random-drop range, so most of the packets dropped would belong to that one connection, causing multiple drops in a single connection.
117
Chapter 6: Properties of RED
• Drops packets before the queue is full
– In the hope of reducing the rates of some flows
• Drops packets in proportion to each flow’s rate
– High-rate flows have more packets
– Hence, a higher chance of being selected for dropping
• Drops are spaced out in time
– Which should help desynchronize the TCP senders
• Tolerant of burstiness in the traffic
– By basing decisions on the average queue length
118
Chapter 6: Problems With RED
• It is hard to get the tunable parameters just right
– How early to start dropping packets?
– What slope for the increase in drop probability?
– What time scale for averaging the queue length?
• RED has seen mixed adoption in practice
– If the parameters aren’t set right, RED doesn’t help
– It is hard to know how to set the parameters
119
Chapter 6: Source-based Congestion Avoidance
• The host watches for some sign from the network that:
– some router’s queue is building up, and
– congestion will happen soon if nothing is done about it.
• How does an end host know a router’s queue is building up?
– The host notices:
• a measurable increase in the RTT for each successive packet it sends
• its sending rate flattens
120
Chapter 6: Source-based Congestion Avoidance
• Example algorithm 1:
– Increase the congestion window normally (like TCP)
– Every two round-trip delays:
• If the current RTT > average of the minimum and maximum RTTs seen so far:
– decrease the congestion window by 1/8
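The rule above can be sketched as a single decision function, invoked once every two round-trip times. The function name and the increase-by-1 step are illustrative assumptions following the usual TCP-style additive increase.

```python
def adjust_cwnd_alg1(cwnd, current_rtt, min_rtt, max_rtt):
    """Algorithm 1 sketch: compare the current RTT against the
    midpoint of the min and max RTTs seen so far."""
    if current_rtt > (min_rtt + max_rtt) / 2:
        return cwnd * (1 - 1 / 8)   # decrease the window by 1/8
    return cwnd + 1                  # otherwise, increase normally
```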
121
Chapter 6: Source-based Congestion Avoidance
• Example algorithm 2 (similar):
– Updating the current window size is based on:
• changes to the RTT, AND
• changes to the window size.
– Every two round-trip delays:
• Calculate (CurrentWindow − OldWindow) × (CurrentRTT − OldRTT)
• If the result is positive: the source decreases the window size by 1/8
• If the result is negative or 0: the source increases the window by one maximum packet size.
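Algorithm 2 translates almost directly into code: a positive product means the window grew and the delay grew with it, a sign of queue buildup. The `mss` parameter (maximum packet size) is an assumption for illustration.

```python
def adjust_cwnd_alg2(cur_win, old_win, cur_rtt, old_rtt, mss=1):
    """Algorithm 2 sketch, run every two round-trip delays:
    if (delta window) x (delta RTT) > 0, back off by 1/8;
    otherwise grow by one maximum packet size."""
    if (cur_win - old_win) * (cur_rtt - old_rtt) > 0:
        return cur_win * (1 - 1 / 8)
    return cur_win + mss
```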
122
Chapter 6: Quality of Service (QoS)
• What is QoS?
– Providing guarantees/bounds on various network properties:
• Available bandwidth for flows
• Delay bounds
• Jitter (variation in delay)
• Packet loss
123
Chapter 6: Quality of Service (QoS)
• The Internet currently provides a single class of “best-effort” service
– No assurances about delivery
• Most existing applications:
– e.g., multimedia applications
– tolerate delays and losses
– can adapt to congestion
– can use retransmissions to make sure data arrives correctly.
124
Chapter 6: Quality of Service (QoS)
• Some “real-time” applications:
– e.g., teleconferencing
– need assurance from the network that data arrives on time
– cannot use retransmissions: they add latency
– both end hosts and routers are responsible for the timeliness of data delivery
– therefore, best-effort delivery is not sufficient for real-time applications
125
Chapter 6: Quality of Service (QoS)
• What we need is a new service model:
– Applications can request higher assurance from the network
– The network treats some packets differently from others
– A network that can provide these different levels of service is said to support quality of service (QoS).
126
Chapter 6: Quality of Service (QoS)
• Question:
– Doesn’t the Internet already support real-time applications? So why a new service model?
• Answer:
– Yes. E.g., Skype (voice/video over the Internet) seems to work OK.
– Why? Best-effort service is often quite good.
– However, if you want a reliable service for real-time applications, best-effort isn’t good enough.
127
Chapter 6: Application Requirements
• What are the different needs of applications?
• Applications are of two types:
– Non-real-time (elastic) applications
– Real-time (inelastic) applications
128
Chapter 6: Application Requirements
• Non-real-time (elastic) applications:
– Traditional data applications
– e.g., telnet, FTP, email, web browsing, etc.
– Can work without guarantees of timely data delivery.
– Can gracefully tolerate delay and losses:
• They do not become unusable as delay increases
• It is nice if data arrives on time; if not, it is still usable.
– Delay requirements vary by application:
• telnet (very interactive: low delay required)
• FTP (interactive bulk transfer)
• email (least interactive: delay tolerable)
129
Chapter 6: E.g. Audio Application
• Real-time applications:
– If a data packet is delayed, it is unusable.
– E.g., an audio application:
• Data: audio samples
• Analog-to-digital conversion
• Digital samples are placed in packets and transferred over the network
• Digital samples are received at the other end and converted back to analog signals
• Audio samples are played back at some appropriate rate
130
Chapter 6: E.g. Audio Application
• Audio application:
– Each audio sample has a particular playback time.
– The playback rate equals the voice-sample collection rate.
– E.g.:
• Sample collection rate (at the sender) = 1 sample per 125 μs
• Playback time of a sample = 125 μs later than the preceding sample
– What happens if data arrives after its playback time (delayed/retransmitted)?
• The packet is useless
131
Chapter 6: E.g. Audio Application
• How to make sure packets arrive in time?
– Obviously, different packets in an audio stream may experience different delays
• Why?
– Packets are queued in switches and routers
– Queue lengths vary over time
– So packets may not arrive at the expected time.
• Solution: introduce a playback buffer
132
Chapter 6: E.g. Audio Application
• Playback buffer:
– The receiver buffers some incoming data in reserve
– A store of packets waiting to be played at the right time
– Playback point:
• Add a constant delay offset to the playback time
– Short packet delay: OK
• The packet sits in the buffer until its playback time arrives
– Long packet delay: OK
• The packet goes into the buffer (assuming a non-empty queue) and is played soon.
– Extremely long packet delay: NOT OK
• The packet arrives after its playback time
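The three cases above reduce to one comparison against the playback point. This sketch assumes the receiver knows the packet's send time (e.g., from a timestamp) and a constant playback offset; the function name is illustrative.

```python
def playback_decision(arrival_time, send_time, playback_offset):
    """Classify an arriving audio packet against its playback point.
    playback_offset: the constant delay added to the send time to
    get the playback time."""
    playback_time = send_time + playback_offset
    if arrival_time <= playback_time:
        return "buffer"    # arrived in time: hold until playback
    return "discard"       # too late: useless to the application
```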
133
Chapter 6: E.g. Audio Application
• Operation of a playback buffer
(Figure, from YouTube: a progress bar showing the data already played at the client/receiver, the playback point, and the playback buffer ahead of it)
134
Chapter 6
• Operation of a playback buffer
(Figure: data plotted against time, showing the sending time at the server/sender, the receiving time, and the later playback time at the client/receiver)
135
Chapter 6: E.g. Audio Application
• How far can we delay the playback of data?
– For audio applications, about 300 ms.
– i.e., the maximum delay between when you speak and when a listener hears you that still allows a conversation.
– The application wants the network to guarantee that all its data arrives within 300 ms.
136
Chapter 6: E.g. Audio Application
(Figure: delay measured over certain paths of the Internet)
• 97% of packets have a latency of 100 ms or less
• If the audio application sets its playback point at 100 ms or less, 3 out of every 100 packets will arrive too late
• The distribution has a long tail: we need to set the playback point over 200 ms to ensure that all packets arrive in time
137
Chapter 6: Taxonomy of Real-Time Applications
• Let us look at different classes of applications that serve to motivate our service model
138
Chapter 6: Taxonomy of Real-Time Applications
• Classification 1: based on tolerance of occasional data loss
1. Tolerant applications (e.g., audio applications):
• A packet counts as lost if it is:
– dropped in the network, OR
– arrives too late to be played back
• Small fix (for occasional loss):
– Interpolate the lost sample from the surrounding samples (little effect on audio quality)
• If more and more samples are lost:
– Voice quality declines → speech becomes incomprehensible
139
Chapter 6: Taxonomy of Real-Time Applications
2. Intolerant applications (e.g., a robot control program):
• A command sent to a robot arm must reach it before the arm crashes into something
• Losing a packet is unacceptable
– Note that many real-time applications are more tolerant of occasional loss than many non-real-time applications
• E.g., audio vs. FTP (losing 1 bit can make a file completely useless)
140
Chapter 6: Taxonomy of Real-Time Applications
• Classification 2: based on adaptability to delay
1. Delay-adaptive applications (e.g., audio applications):
• The application can adapt to the amount of delay packets experience in traversing the network
• E.g.:
– The application notices that packets almost always arrive within 300 ms of being sent
– It therefore sets its playback point accordingly, buffering any packets that arrive in less than 300 ms.
141
Chapter 6: Taxonomy of Real-Time Applications
– Suppose the application subsequently observes that all packets arrive within 100 ms of being sent.
– If the application adjusts its playback point to 100 ms:
• users of the application would probably perceive an improvement
• it requires playing samples at an increased rate for some period
– We should advance the playback point only when:
• there is a perceptible advantage, AND
• there is some evidence that the number of late (lost) packets is acceptably small
142
Chapter 6: Taxonomy of Real-Time Applications
2. Rate-adaptive applications:
– Another class of adaptive applications
– E.g., many video coding algorithms can trade off bit rate against quality.
• A video is a finite sequence of correlated images.
• A higher bit rate accommodates higher image quality.
• Video coding = image compression + a temporal component.
– If the network can support a certain bandwidth, we can set our coding parameters accordingly.
– If more bandwidth becomes available later, we can change the parameters to increase the quality.
143
Chapter 6: Approaches to QoS Support
• We need a richer service model than best-effort, one that meets the needs of any application.
• This leads us to a service model with several classes, each available to meet the needs of some set of applications.
• Approaches have been developed to provide a range of qualities of service
• Two categories:
1. Fine-grained approaches
2. Coarse-grained approaches
144
Chapter 6: Approaches to QoS Support
1. Fine-grained approaches:
• Provide QoS to individual applications or flows
• Integrated Services:
– Often associated with the Resource Reservation Protocol (RSVP)
2. Coarse-grained approaches:
• Provide QoS to large classes of data or aggregated traffic
• Differentiated Services:
– Probably the most widely deployed QoS mechanism
145
Chapter 6: Integrated Services (RSVP)
• Integrated Services (IntServ):
– A body of work the IETF produced around 1995–1997
– Defines:
• a number of service classes that meet the needs of certain application types
• how RSVP can be used to make reservations using these service classes
146
Chapter 6: Integrated Services (RSVP)
• Service classes:
1. Guaranteed service:
• Designed for intolerant applications.
• Applications require that packets never arrive late.
• The network should guarantee that the maximum delay any packet will experience has some specified value.
2. Controlled load service:
• Designed for tolerant, adaptive applications.
• These applications run quite well on lightly loaded networks.
• The service should emulate a lightly loaded network for applications that request it, even though the network may in fact be heavily loaded.
147
Chapter 6: Integrated Services (RSVP)
• Overview of mechanisms (steps):
1. The real-time application provides a flowspec to the network:
• The flowspec specifies the type of service required
2. The network performs admission control:
• The network decides whether it can provide that service.
• Admission control: the process of deciding when to say no.
3. Network users/components perform resource reservation:
• Users and components of the network exchange information: requests for service, flowspecs, admission control decisions.
• This is done using RSVP.
4. Switches/routers perform packet scheduling:
• Network switches and routers meet the requirements of the flows by managing the way packets are queued and scheduled for transmission.
148
Chapter 6: Integrated Services (RSVP)
1. The real-time application provides a flowspec to the network
– A flowspec has two parts:
• TSpec: the part that describes the flow’s traffic characteristics
– e.g., the bandwidth used by the flow
» not always a single number per application
» a video application generates more bits per second when the scene is changing rapidly than when it is still
» sending only the average bandwidth is insufficient
• RSpec: the part that describes the service requested from the network
– e.g., a target delay bound (guaranteed service), or no additional parameters (controlled load service)
149
Chapter 6: Integrated Services (RSVP)
2. The network performs admission control:
– Admission control looks at the TSpec and RSpec of the flow.
– It then decides whether the desired service can be provided to the requested amount of traffic, given the currently available resources, without causing any previously admitted flow to receive worse service than it requested.
– If it can provide the service, the flow is admitted; if not, it is denied.
150
Chapter 6: Integrated Services (RSVP)
3. Network users/components perform resource reservation (RSVP):
– Connectionless networks like the Internet have had no setup protocols.
– However, real-time applications need to provide much more setup information to the network.
– The Resource Reservation Protocol (RSVP) is one of the most popular setup protocols.
151
Chapter 6: Integrated Services (RSVP)
– The connectionless nature of the Internet is robust.
• Why?
– Because connectionless networks rely on little or no state being stored in the network.
– Routers can crash and reboot, and links can go up and down, while end-to-end connectivity is still maintained.
• RSVP tries to maintain this robustness by using the idea of soft state in the routers.
• RSVP also supports multicast flows just as effectively as unicast flows.
152
Chapter 6: Integrated Services (RSVP)
• E.g., unicast: one sender and one receiver trying to get a reservation for traffic flowing between them
(Figure: making reservations on a multicast tree)
153
Chapter 6: Integrated Services (RSVP)
1. The sender sends a PATH message containing its TSpec toward the receiver.
• This informs the receiver of the type of traffic and the path used by the sender.
• This helps the receiver make the appropriate resource reservation at each router on the path.
2. Each router on the path looks at the PATH message and figures out the reverse path that will be used to send reservations from the receiver back to the sender.
(Figure: making reservations on a multicast tree; PATH message carrying the sender’s TSpec)
154
Chapter 6: Integrated Services (RSVP)
3. The receiver sends a reservation back “up” the multicast tree in a RESV message.
4. Each router on the path looks at the reservation request and tries to allocate the necessary resources to satisfy it.
5. If the reservation can be made, the RESV request is passed on to the next router.
• If not, an error message is returned to the receiver that made the request.
(Figure: RESV message carrying the sender’s TSpec and the receiver’s RSpec)
• If all goes well, the correct reservation is installed at every router between the sender and the receiver.
• As long as the receiver wants to retain the reservation, it sends the same RESV message about once every 30 seconds.
155
Chapter 6: Integrated Services (RSVP)
4. Switches/routers perform packet scheduling:
– Finally, the routers deliver the requested service to the data packets.
– There are two things that need to be done:
1. Classify packets:
– Associate each packet with the appropriate reservation so that it can be handled correctly.
2. Schedule packets:
– Manage the packets in the queues so that they receive the service that has been requested.
156
Chapter 6
• Approaches have been developed to provide a range of qualities of service
• Two categories:
1. Fine-grained approaches
• Integrated Services (RSVP)
2. Coarse-grained approaches
• Differentiated Services
157
Chapter 6: Differentiated Services (DiffServ)
• The Integrated Services architecture:
– allocates resources to individual flows
• The Differentiated Services model:
– allocates resources to a small number of classes of traffic.
158
Chapter 6: Differentiated Services (DiffServ)
• E.g., add one new class: “premium.”
• How to identify packets as premium or not?
– Use a bit in the packet header:
• bit = 1 → premium packet
• bit = 0 → best-effort packet
159
Chapter 6: Differentiated Services (DiffServ)
• Who sets the premium bit, and under what circumstances?
160
Chapter 6: Differentiated Services (DiffServ)
• Answer:
– There are many possible answers.
– A common approach is to set the bit at an administrative boundary.
– E.g., the router at the edge of an ISP’s network might set the bit for packets arriving on an interface that connects to a particular company’s network.
– Why?
• The ISP might do this because that company has paid for a higher level of service than best effort.
161
Chapter 6: Differentiated Services (DiffServ)
• What does a router do differently when it sees a packet with the bit set?
162
Chapter 6: Differentiated Services (DiffServ)
• Answer:
– The IETF standardized the behaviors of routers.
• These are called “per-hop behaviors” (PHBs).
– The Expedited Forwarding (EF) PHB:
• One of the simplest PHBs.
• If a packet is marked for EF treatment, it should be forwarded by the router with minimal delay and loss.
• How to guarantee this?
– The router ensures that the arrival rate of EF packets is less than the rate at which the router can forward EF packets.
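The EF condition above is a simple rate comparison. The `ef_share` parameter, the fraction of link capacity provisioned for EF traffic, is an illustrative assumption; the slides only state that the EF arrival rate must stay below the forwarding rate.

```python
def ef_admissible(ef_arrival_rate, link_rate, ef_share=0.5):
    """EF's guarantee holds only if EF packets arrive no faster than
    the router can forward them. Rates are in the same units
    (e.g., Mbps); ef_share caps the capacity reserved for EF."""
    return ef_arrival_rate < link_rate * ef_share
```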
163
Chapter 6: Question
• True or false? Suppose host A is sending a large file to host B over a TCP connection. The number of unacknowledged bytes that A sends cannot exceed the size of the advertised receiver buffer.
164
Chapter 6: Answer
• TRUE.
• TCP is not permitted to overflow the allocated receiver buffer.
• Hence, when the sender cannot send any more data, the advertised ReceiverWindow is 0.
• In that case, the entire buffer holds unacknowledged data.
165
Chapter 6: Question
• TCP waits until it has received three duplicate ACKs before performing a fast retransmit.
• Why do you think the TCP designers chose not to perform a fast retransmit after the first duplicate ACK for a segment is received?
166
Chapter 6: Answer
• Packets can arrive out of order from the IP layer.
• Whenever an out-of-order packet is received, it generates a duplicate ACK.
– If we performed a retransmission after the first duplicate ACK, the sender would introduce too many redundant packets into the network.
167
Chapter 6
Quiz This Friday (10/24)!