
• PS2 is out, due Oct 15

• Email [email protected] by tomorrow
  – your project team-list
  – bi-weekly meeting (wed 5-6pm, fri 5-6pm)

• No class next Monday, proposal due


Congestion control & Resource management

It is important to control load

• Scenario: ISP buys an expensive cross-country link
  – Underload: idle link, waste of money
  – Overload: equally bad…

• Buy a fatter “pipe”?
  – Expensive; pipe is never fat enough
  – How to measure load?

Queues smooth bursty traffic

• Data traffic is bursty

• What if avg load < capacity < peak load?

• Queues smooth load
  – Buffer grows when load > capacity
  – Buffer drains when load < capacity

• Is a bigger queue always better?
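The smoothing effect can be sketched with a tiny discrete-time simulation. This is only an illustration: the function name `simulate_queue`, the per-tick model, and the traffic numbers are assumptions, not part of the lecture.

```python
def simulate_queue(arrivals, capacity, buffer_size):
    """Drop-tail FIFO queue served at `capacity` packets per tick.

    Returns (queue length after each tick, total packets dropped).
    """
    queue = 0
    dropped = 0
    history = []
    for arrived in arrivals:
        queue += arrived                 # packets arriving this tick
        queue -= min(queue, capacity)    # link sends up to `capacity` packets
        if queue > buffer_size:          # whatever does not fit is dropped
            dropped += queue - buffer_size
            queue = buffer_size
        history.append(queue)
    return history, dropped


# Illustrative bursty source: avg load 3 < capacity 4 < peak load 9
arrivals = [9, 0, 0, 9, 0, 0]
big_history, big_dropped = simulate_queue(arrivals, capacity=4, buffer_size=10)
small_history, small_dropped = simulate_queue(arrivals, capacity=4, buffer_size=2)
```

With the 10-packet buffer the bursts are absorbed and drained during the idle ticks. With the 2-packet buffer part of each burst is lost even though the average load is below capacity. A bigger buffer avoids the drops here, but every queued packet also waits longer.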

Bad things could happen upon overload …

• Scenario: Load >> capacity

• Router queue grows
  – Long queuing delay

• Sender retransmits after incurring timeout

• Retransmissions make the queues longer

• Useful throughput → zero

Congestion collapse

[Figure: goodput vs. avg offered load. Regions, left to right: link utilization increases with load; queues start building up; congestion collapse.]

Our goal

[Figure: goodput vs. avg offered load, with the target operating point marked “Stay here!”]

• Low queuing delay
• High utilization
• No danger of collapse

Congestion control strategies

• End-to-end (TCP)
  – Routers give minimal feedback
    • e.g. drops with FIFO queue
  – Sources control load based on feedback

• Router-assisted

A strawman

[Figure: hosts A, B, C, D connected through routers R1 and R2; link labels: 4 pkt/sec and 10 pkt/sec]

• Host A sends a packet, waits for an ACK, sends a packet etc.

• How fast does A send when both A->B and C->D flows are active?

• How fast does A send when C->D goes away?

• ACK “clocks” packet transmission (conservation of packets)

Strawman under-utilizes link with non-negligible RTT

• RTT 0.5s, link speed 4 packets/sec

• Strawman’s throughput?
  – 1/0.5 = 2 pkts/sec << link speed

Control send rate w/ window size

• Allow w un-acknowledged packets in flight

• Send rate: w /RTT

• Change w to:
  – Keep long-delay networks busy
  – Adjust send rate to available b/w
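A minimal sketch of the w/RTT relationship, reusing the strawman numbers (RTT 0.5s, link speed 4 packets/sec). The helper names `send_rate` and `window_to_fill` are made up for illustration.

```python
import math


def send_rate(w, rtt):
    """Packets per second when w un-acknowledged packets are allowed in flight."""
    return w / rtt


def window_to_fill(link_rate, rtt):
    """Smallest whole window that keeps a link of `link_rate` pkts/sec busy."""
    return math.ceil(link_rate * rtt)


strawman_rate = send_rate(1, 0.5)      # one packet per RTT
needed_window = window_to_fill(4, 0.5)
full_rate = send_rate(needed_window, 0.5)
```

The strawman is the special case w = 1. On this link a window of 2 packets is already enough to reach the link speed, and a longer RTT would require a proportionally larger window.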

What is the right w?

• Let the network tell end hosts?
  – Not always possible, e.g. cable modem is not an IP router

• Try different w’s and see which is best?

• Can we use the same w over time?

Try some w, adjust as we go

• When should we increase w?
  – If current w works well, try a bigger one

• When should we decrease w?
  – When we hit link capacity (i.e. see drops)
  – Can we squeeze more b/w out of our competitors?

How to adjust w?

• Simple ways to increase/decrease w
  – MI: Increase w by 10%
  – AI: Increase w by 10 pkts
  – MD: Decrease w by 50%
  – AD: Decrease w by 10 pkts

AI or MI, AD or MD?

• Some intuitive arguments
  – AI: too slow at high speed, too fast at low speed
  – MI: scales better with link speed
  – AD: does not slow down fast enough, esp. when another connection starts
  – MD: under overload, queues grow exponentially; MD also cuts load exponentially

Why AIMD? Goals of congestion control

• Scalable
  – Should work well in any situation

• Efficient
  – High link utilization w/o danger of congestion collapse

• Fair
  – Ideally, all users obtain equal share

• Distributed operation

• Convergence

Why AIMD? [Chiu,Jain 89]

[Figure: plot of User1’s b/w vs. User2’s b/w. Optimal fairness line: x1 = x2. Optimal efficiency line: x1 + x2 = link speed.]

Our goal: optimal fairness and efficiency

MIMD does not converge to fairness

[Figure: User1’s b/w vs. User2’s b/w, MIMD trajectory]

AIMD converges to optimal fairness and efficiency
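The two convergence claims can be checked with a small two-user simulation. This is a sketch under simplifying assumptions that are not in the slides: both users see the same binary feedback at the same time, and the step sizes (+1, ×1.1, ×0.5) and starting rates are arbitrary.

```python
def simulate(x1, x2, capacity, increase, decrease, steps=200):
    """Two users sharing one link with synchronized binary feedback."""
    for _ in range(steps):
        if x1 + x2 > capacity:          # overload: both users back off
            x1, x2 = decrease(x1), decrease(x2)
        else:                           # spare capacity: both users probe
            x1, x2 = increase(x1), increase(x2)
    return x1, x2


def additive_increase(x):
    return x + 1.0


def multiplicative_increase(x):
    return x * 1.1


def multiplicative_decrease(x):
    return x * 0.5


# Start far from the fairness line x1 = x2
aimd = simulate(10.0, 80.0, 100.0, additive_increase, multiplicative_decrease)
mimd = simulate(10.0, 80.0, 100.0, multiplicative_increase, multiplicative_decrease)
```

Under AIMD the gap x2 - x1 is untouched by additive increase and halved by every decrease, so the two rates move toward x1 = x2. Under MIMD both rates are always scaled by the same factor, so the ratio x2/x1 stays at its initial value of 8 forever.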

TCP

• At equilibrium: AIMD
  – AI: w = w+1 per RTT
  – MD: w = w/2 upon loss

• To bootstrap:
  – slow start: w = 2w per RTT

• How to use ACKs to “clock” adjustments to w?
  – Upon receiving an ACK: w += 1/w
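Why w += 1/w per ACK amounts to roughly w = w+1 per RTT: about w ACKs arrive in one RTT, and each adds about 1/w. The sketch below makes that concrete. The function names are hypothetical, and real TCP stacks count in bytes and handle many more cases.

```python
def congestion_avoidance_rtt(w):
    """One RTT of additive increase, applied ACK by ACK."""
    acks = int(w)              # roughly one ACK per packet in the window
    for _ in range(acks):
        w += 1.0 / w           # per-ACK update
    return w


def slow_start_rtt(w):
    """One RTT of slow start: the window doubles."""
    return 2 * w


def on_loss(w):
    """Multiplicative decrease, never going below one packet."""
    return max(1.0, w / 2)


after_one_rtt = congestion_avoidance_rtt(10.0)
after_loss = on_loss(after_one_rtt)
```

Starting from w = 10, one RTT of per-ACK updates ends slightly below 11, because each successive 1/w increment is a little smaller than the last.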

Evolution of TCP’s congestion window

[Figure: sawtooth of window W over time, oscillating between Wmax/2 and Wmax]

How much buffer space is needed?

• Bigger buffer is not always good
  – Expensive and results in longer delay

• Smaller buffer is also not good
  – Might not fully utilize the link

• At any time,
  – $Q + \text{b/w} \times RTT = \sum_i w_i$

Homework question: calculate buffer space required for 100% utilization

Loss rate vs. TCP throughput

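[Figure: TCP window sawtooth between Wmax/2 and Wmax]

In one epoch, TCP sends $w_{max}/2 + (w_{max}/2 + 1) + \dots + w_{max} \approx \frac{3}{8} w_{max}^2$ pkts

So, loss rate $p = \frac{8}{3 w_{max}^2}$

Avg window is $(w_{max}/2 + w_{max})/2 = \frac{3}{4} w_{max}$

So, throughput $= \frac{3 w_{max}}{4 \cdot rtt} = O\left(\frac{1}{rtt \sqrt{p}}\right)$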
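A quick numeric check of the derivation: one loss per epoch of about (3/8)·wmax² packets gives p = 8/(3·wmax²), and substituting wmax into the average rate 3·wmax/(4·rtt) gives sqrt(3/2)/(rtt·sqrt(p)). The function names and the sample values wmax = 100, rtt = 0.1 are illustrative assumptions.

```python
import math


def packets_per_epoch(w_max):
    """Exact packet count for one sawtooth: w_max/2 + (w_max/2 + 1) + ... + w_max."""
    return sum(range(w_max // 2, w_max + 1))


def loss_rate(w_max):
    """One loss per epoch of roughly (3/8) * w_max^2 packets."""
    return 8.0 / (3.0 * w_max ** 2)


def throughput(w_max, rtt):
    """Average window (3/4) * w_max sent once per RTT."""
    return 3.0 * w_max / (4.0 * rtt)


def throughput_from_loss(p, rtt):
    """Same throughput written in terms of the loss rate p."""
    return math.sqrt(1.5) / (rtt * math.sqrt(p))


w_max, rtt = 100, 0.1
exact = packets_per_epoch(w_max)        # 3825
approx = 3.0 / 8.0 * w_max ** 2         # 3750.0
by_window = throughput(w_max, rtt)
by_loss = throughput_from_loss(loss_rate(w_max), rtt)
```

The two throughput expressions agree, which is the O(1/(rtt·sqrt(p))) relationship: quadrupling the loss rate halves the throughput.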

Loss rate vs. # of connections

• TCP uses loss as a signal to adjust rate

• N connections, each gets $\frac{1}{n}$ share of b/w

Per-connection throughput $= O\left(\frac{b}{n}\right) = O\left(\frac{1}{rtt \sqrt{p}}\right)$

So $p = O(n^2)$

Some concrete numbers

• A very slow modem with an 8-pkt queue

• TCP window varies from 4 to 8 pkts

• Loss rate? 1/(4+5+…+8) = 3.3%

• A second TCP starts, each TCP window varies from 2 to 4 pkts

• Loss rate? 1/(2+3+4) = 11% !!
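The arithmetic above, as a short sketch: one loss per sawtooth epoch, with the epoch sending every window size from the minimum to the maximum once. The function name is made up for illustration.

```python
def loss_rate_for_window_range(w_min, w_max):
    """One loss per epoch; the epoch sends w_min + (w_min + 1) + ... + w_max packets."""
    return 1.0 / sum(range(w_min, w_max + 1))


one_flow = loss_rate_for_window_range(4, 8)    # 1/30
two_flows = loss_rate_for_window_range(2, 4)   # 1/9
```

Halving the per-connection window more than triples the loss rate, in line with p growing roughly as the square of the number of connections.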

TCP-friendly congestion control

• Not all protocols use window-based congestion controls

• To compete fairly with TCP, one must send at rates $O\left(\frac{1}{\sqrt{p}}\right)$

Congestion control & resource management

• End-to-end (TCP)

• Router-assisted resource management
  – Explicit congestion notification
  – Active queue management

Why add “smarts” to routers?

1. Congestion avoidance instead of control
  – Routers have the most information about congestion

2. Fair queueing
  – Isolation
    • End-to-end schemes cannot prevent a misbehaving flow from affecting others
  – Quality of Service (QoS)
    • Ensure desired delay, rate, jitter etc.

Routers’ “bag of tricks”

• Congestion signaling
  – Tell sources if congestion occurs (or is about to occur), or how much to send

• Buffer management
  – Decide which packets to drop/mark

• Scheduling
  – Which packets get priority to send over others

#1 Better congestion control w/ routers’ help

• Can routers alone solve congestion control?
  – No!
  – End points must still curb send rate based on router feedback

Congestion signalling

• Packet drop
  – One bit information (binary)

• Marking packets
  – Set one or more bits in header, echoed by ACKs

RED (random early detection) [Floyd, Jacobson 93]

• Drop-tail queues drop packets from multiple connections in bursts
  – All connections cut w simultaneously, resulting in link underutilization

• Key idea:
  – Drop/mark packets before queue is full
  – When min < qa < max, mark packets with probability pa
    • qa: avg queue length, calculated using EWMA
    • pa: varies between 0 and pmax, increases as qa increases
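A minimal sketch of the marking rule. Several details here are assumptions rather than slide content: the linear ramp between min and max, marking everything once the average reaches max, and the EWMA weight. The full algorithm in the RED paper also spaces marks out by counting packets since the last mark, which is omitted.

```python
def ewma(avg, sample, weight=0.002):
    """Exponentially weighted moving average of the queue length.

    The default weight is only an illustrative value.
    """
    return (1 - weight) * avg + weight * sample


def mark_probability(q_avg, q_min, q_max, p_max):
    """Probability of marking/dropping an arriving packet."""
    if q_avg < q_min:
        return 0.0                       # short average queue: never mark
    if q_avg >= q_max:
        return 1.0                       # assumed: mark everything above max
    # assumed linear ramp from 0 at q_min up to p_max at q_max
    return p_max * (q_avg - q_min) / (q_max - q_min)


below = mark_probability(5, 10, 30, 0.1)
halfway = mark_probability(20, 10, 30, 0.1)
above = mark_probability(30, 10, 30, 0.1)
```

Because the decision uses the averaged queue length rather than the instantaneous one, a short burst does not trigger marking, while a persistent queue does.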

XCP [Katabi et al, 02]

• Key ideas:
  – Router explicitly allocates rate for each flow
  – Separate efficiency control from fairness control

Summary: congestion control techniques

• Drop-tail queue + TCP
  – Binary feedback when congestion happens

• RED + TCP
  – Binary feedback before congestion

• XCP
  – Precise feedback (many bits of information)

What if end hosts cheat?

• Cheating senders

• Cheating receivers

Why add “smarts” to routers?

1. Better congestion control

2. Fair queueing

Why scheduling?

• Fairness
  – Drop-tail (FIFO)/RED do not guarantee fairness among flows
  – Abusive flows get more b/w by sending more

• Differentiated service
  – Different flows have different requirements
  – Real time video conference desires low delay, low jitter

Anatomy of a router

[Figure: packet path through a router: CheckIPHeader, LookupIPRoute, then per output port a Classifier, a Scheduler and ToDevice(eth0) / ToDevice(eth1)]

• Classifier: classify packets according to ToS field, individual connection or aggregate flows

• Buffer management: whether to drop/mark packets

• Scheduler: which queue and which of the packets to send next

A simple fair scheduler: round-robin (RR)

• How?
  – Serve packets in round-robin order, skip empty queues

• If a flow sends faster than 1/n of the link rate?

• If a flow sends slower than 1/n of the link rate?

• RR tries to achieve max-min fairness
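Max-min fairness can be computed directly by "water-filling": flows that want less than an equal share get their full demand, and the leftover is split among the rest. A sketch, with a made-up function name and made-up demands:

```python
def max_min_share(demands, capacity):
    """Max-min fair allocation of `capacity` among flows with the given demands."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Visit flows from smallest demand to largest
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        share = remaining / len(unsatisfied)   # equal split of what is left
        smallest = unsatisfied[0]
        if demands[smallest] <= share:
            # This flow wants less than its share: give it all it asks for
            alloc[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            unsatisfied.pop(0)
        else:
            # Every remaining flow wants at least the share: split equally
            for i in unsatisfied:
                alloc[i] = share
            break
    return alloc


shares = max_min_share([2, 8, 10], capacity=12)
```

Here the equal share would be 4 each. The flow that only wants 2 gets 2, and the two heavier flows split the remaining 10 evenly. This answers both questions above: a flow sending less than its share keeps everything it sends, and a flow sending more is capped.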

Is RR really fair?

• What if packets have different sizes?

• We want fairness at bit-level

Bit-level round robin

• Idea:
  – Assign round # to start and finish of a packet according to bit-level RR
  – Send packet with the smallest finish round #

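Bit-level RR example

[Figure: queued packets labeled with start (S) and finish (F) round numbers; current round = 0. Labels: S:0 F:20; S:0 F:1460; S:0 F:100; S:100 F:600; S:20 F:520. Finish rounds in sending order: 20, 100, 520, 600, 1460.]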
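The start/finish labels in the example can be reproduced with a few lines. The grouping of packets into three flows (named A, B, C here) is inferred from the labels and is an assumption, as is the simplification that every packet is already queued at round 0. A real fair queueing implementation also has to advance the round number as flows come and go.

```python
def finish_rounds(flows, current_round=0):
    """Tag each queued packet with (start, finish, flow) and sort by finish round.

    `flows` maps a flow name to its packet sizes in arrival order.
    """
    tagged = []
    for name, sizes in flows.items():
        last_finish = current_round
        for size in sizes:
            # A packet starts when the flow's previous packet finishes
            start = max(last_finish, current_round)
            finish = start + size
            tagged.append((start, finish, name))
            last_finish = finish
    return sorted(tagged, key=lambda t: t[1])


# Hypothetical flows matching the labels in the example
flows = {"A": [20, 500], "B": [1460], "C": [100, 500]}
order = finish_rounds(flows)
finishes = [finish for _, finish, _ in order]
```

The large packet of flow B is sent last even though it was queued at round 0, because its finish round (1460) is larger than that of every packet from the flows sending small packets.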

FQ is too complex for fast routers

• Lots of state
  – 100,000 flows are common in core routers

• High speed processing
  – 50ns to process each packet for a 10Gbps link

• Only reasonable to implement FQ at low speed, at edge routers

Core stateless fair queueing (CSFQ)

• Stoica et al. (suggested reading list)

Edge routers keep track of per-flow state: mark input rates in packet header (dynamic packet state)

Core routers are stateless: emulate FQ by dropping a flow’s packets with the right probability according to its input rate
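A sketch of what "the right probability" can look like: drop with probability max(0, 1 - fair share / input rate), so that the expected forwarded rate of an over-sending flow equals the fair share. The function name is hypothetical, and estimating the fair share at the core router, which is the harder part of CSFQ, is left out and simply passed in as a parameter.

```python
def drop_probability(input_rate, fair_share):
    """Drop probability for a packet labeled with its flow's input rate."""
    if input_rate <= fair_share:
        return 0.0                        # flow is within its share: never drop
    return 1.0 - fair_share / input_rate  # forward only fair_share worth on average


heavy = drop_probability(10.0, 2.5)
light = drop_probability(1.0, 2.5)
forwarded = 10.0 * (1.0 - heavy)
```

The core router needs only the rate label carried in the packet and its own fair-share estimate, so it keeps no per-flow state.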

Summary

• Congestion control
  – End-to-end + FIFO queue
  – End-to-end + router assisted

• Fair queueing
  – Forcing (giving incentives to) endpoints to send at fair share
